Relation of the Mode of Prosthesis Control to Psychomotor
Performance of Arm Amputees¹


Hilde Groth and John Lyman 


University of California, Los Angeles 


The replacement of some degree of func- 
tional manual skill for arm amputees requires 
that muscle groups in unimpaired parts of the 
body be used to control and power the forces 
and movements available to the amputee 
with his prosthesis. In current practice the 
various types of hooks and artificial hands 
which are fitted to the amputee can be op- 
erated for prehension by pulling a cable (4, 
6, 7). The source of control and power is 
either a strap across the back or a pin in- 
serted through a tunnel in the biceps or pec- 
toral muscles which can be attached to the 
cable. Examples of both shoulder harness 
and cineplastic muscle-motor prosthesis, uti- 
lizing the strap and muscle-pin arrangements, 
respectively, are shown in Fig. 1 (1, 8). 

The prehensile devices, hooks or hands, are 
commonly called “terminal devices” and will 
be referred to as “TDs” hereafter. These 
TDs can be differentiated on the basis of two
different principles of operation, namely
whether they are “voluntary opening” or
“voluntary closing” mechanisms. “Voluntary 
opening” (VO) TDs are those for which a 
pull on the cable opens the device from a 
closed resting position. Such devices close on 
objects by spring force when the amputee releases
the cable tension. “Voluntary closing”
(VC) TDs work on the principle that the 
amputee must pull on the cable to close the 
device from an open position. 


1 This work was supported by Contract No. VAm- 
23110 between the Veterans Administration and the 
Department of Engineering, University of California, 
Los Angeles. The opinions expressed are those of 
the authors and do not necessarily represent those of 
the Veterans Administration. 


Fig. 1. Two types of control harness for below-elbow
arm amputees: cineplasty control and harness control.

A survey of the prosthetic literature reveals
that there are no valid criteria available
for choosing either principle in preference to
the other for prescription. Several writers
have expressed opinions in terms of “logical”
reasons and the lore has developed that the
VC principle is preferable on the grounds that
it is more compatible with normal neuromuscular
patterns and that it permits grading of
prehension force (2, 3, 5). Since the mecha- 
nisms for the two types of device differ con- 
siderably in complexity, with the VO TD be- 
ing inherently the simpler device, both eco- 
nomic and functional factors have made it 
seem desirable to determine which, if either, 
of the two principles could be validated in 
terms of superior function for the amputee. 
The purpose of this investigation was to make 
performance comparisons under standard lab- 
oratory conditions as a step toward establish- 
ing criteria for future TD designs for pre- 
scription. The specific hypotheses tested were 
as follows: (a) There is no inherent performance
superiority in either the VO or VC principle.
(b) Performance time will be related
to the amount of attention required for suc- 
cessful operation of the device. 


Procedure 
Subjects 


Ten unilateral below-elbow amputees, five uni- 
lateral above-elbow amputees, and two bilateral am- 
putees were recruited by mail and telephone from 
the amputee file of the UCLA Artificial Limbs Proj- 
ect. All Ss were familiar with the operation of the 
two TD mechanisms and no training was necessary. 
Since the nature of the sample did not permit the 
customary matching of Ss on variables of age, sex, 
and education, the following criteria were used for 
inclusion in the sample: 

1. The Ss had to function adequately in their jobs 
and earn a living. 

2. They should be free of pronounced signs of emotional
maladjustments related to amputation.

3. They should show interest and motivation for 
participating in the experiment.

Fig. 2. Test layout and types of terminal devices used.


Apparatus and Tests


Two commercially available types of TD were
chosen that are nearly identical except for their
operating principles. The VO principle was represented
by Northrop-Sierra two-load hooks and hands and
the VC principle by Army Prosthetics Research
Laboratory (APRL) hooks and hands. The VC hooks
are equipped with a locking mechanism which
requires an extra motion (i.e., pull to unlock). In
order to get direct comparison with the VO principle,
this locking mechanism was removed on some of the
VC TDs used in the tests. Some pertinent
characteristics of the TDs are shown in Table 1.


Table 1
Physical Characteristics of TD

TD         Spring Force (lbs.)
VO hook    10.0
VC hook    6.5
VO hand    12.0
VC hand    12.0


It will be noted that the spring force in the VO hook is
3.5 lbs. greater than in the VC hook, a fact which 
might bias the results against the VO hook when 
comparing the criterion measures for the two 
principles. 

To help minimize practice effects three simple 
manipulation tests were used which required only the 
motions of grasp, transport, and release of the object.
The three tests were the Minnesota Rate of
Manipulation Placing Test (MRMT), the Table Set- 
ting Test, and the Cup Test. The Table Setting 
Test and the Cup Test were included in order to in- 
crease the motivation of the Ss by including tasks 
which were somewhat “realistic” to everyday living 
and to establish a more general validity for the re- 
sults by introducing a variety of shapes and weights, 
supplementing the MRMT. Performance time for 
each test was taken with a stop watch and recorded 
to .1 second. 

Test-retest reliabilities for these tests were estab- 
lished on a sample of 30 engineering students. The 
correlation coefficients for the various treatments 
ranged from r = .78 to r = .91.

Figure 2 shows a general view of the tests and the 
work space. 


Routine 


After each S was thoroughly familiarized with the 
tests, each test was administered individually in a sub- 
ject-by-treatment randomized block design. Treat- 
ments for each group of amputees were as follows: 

Unilateral BE amputees.—Intact hand and five TD 
conditions including a VO hook, VC nonlock hook, 
VC lock hook, VO hand, and VC hand. Two pos- 
tural conditions, sitting and standing, were used in 
order to indicate the effect of the relative height of 
the work surface. One TD was tested each week. 
Only eight of the Ss participated in the artificial 
hand conditions as the other two had never worn 
hands. 

Unilateral AE amputees.—Same TDs as the BE
unilateral amputees, but tests were administered in 
the standing condition only. The Cup Test was 
omitted because it required a degree of arm steadi- 
ness which the Ss could not achieve. Tests were ad- 
ministered in two sessions at least one week apart. 

Bilateral amputees.—Only the three hook TDs were 
used as the Ss never wore artificial hands. Tests 
were administered in a single session and were the 
same as for the AE amputees. 

In order to compare the laboratory results with 
amputee preferences for a certain TD, a question- 
naire listing 25 simple manipulative motions from 
daily life (e.g., holding a fork when eating) was ad- 
ministered to each S. 


Calculations 


To minimize the assumptions required for statisti- 
cal testing, the significance of differences among the 
various treatment conditions was tested with the 
nonparametric Signed Rank Test for Paired Observa- 
tions (9). 
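
The signed-rank procedure named above can be sketched as follows. This is only an illustration of the rank-sum computation for paired observations; the function name and the performance times are hypothetical and are not taken from the study's deposited tables.

```python
from typing import Sequence, Tuple


def signed_rank_statistic(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, int]:
    """Return (W_plus, W_minus, n) for paired observations x and y.

    Zero differences are dropped, and tied absolute differences
    receive the average of the ranks they span.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Hypothetical performance times (seconds) for eight subjects.
vo_times = [61.2, 58.4, 70.1, 65.0, 59.3, 72.8, 66.5, 63.9]
vc_lock_times = [68.0, 63.1, 69.5, 74.2, 66.0, 80.3, 71.9, 70.4]

w_plus, w_minus, n = signed_rank_statistic(vc_lock_times, vo_times)
print(w_plus, w_minus, n)
```

The smaller of the two rank sums would then be compared with a tabled critical value for n pairs; the table lookup is left out of this sketch.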


Results²
Unilateral BE Amputees 


Comparison of the differences between per- 
formance times with the VO and VC nonlock 
hooks was not significant for any of the treat- 
ment conditions, but the difference between 
VO and VC lock hooks was significant beyond 
the .01 level. Performance with VC locking 
devices was considerably slower than with 
the VO devices on all tests. Figure 3 is a 
graphical presentation of these results for the 
MRMT standing condition. This result is 
representative of the results from the other 
tests. 

Fig. 3. Results for Minnesota Rate of Manipulation Test.

Tests for the influence of posture on the
performance scores showed that this factor is
of no importance in VO versus VC comparisons.
The performance in the sitting position
was significantly slower in only one case.
According to Wilkinson (10) the probability
that this result was obtained by chance in a
group of such measurements is p = .14,
suggesting that sampling fluctuations may have
been responsible.

2 The statistical tables have been deposited with
the American Documentation Institute. Order
Document No. 5128 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of
Congress, Washington 25, D. C., remitting in advance
$1.25 for microfilm or $1.25 for photocopies. Make
checks payable to Chief, Photoduplication Service,
Library of Congress.

Performance times for the normal hand of 
the unilateral amputee subjects were slightly 
greater and more variable than those obtained 
for the normative sample. Table 2 summa- 
rizes the mean performance times and the 
standard deviations of the BE amputees for 
the various treatment conditions. 


Unilateral AE Amputees 


Results for the five AE amputees are shown 
in Table 3. The trend found for the BE am- 
putee sample was confirmed by the AE re- 
sults. There was no significant difference 
when comparing the VO with VC nonlock 
hook, but comparison between the VO and 
VC locking devices indicated that perform- 
ance with the latter was significantly slower. 


Bilateral Amputees 


Since only two Ss were available, no sta- 
tistical inferences could be made. The re- 
sults for each amputee again confirmed the 
trend found with the other two samples,
however.


Questionnaire 


Assessment of the questionnaires indicated 
each amputee’s choice of either a VO or a VC 
device for a specified manipulatory activity in 
everyday life and gave the following over-all 
result: In a total of 418 responses, 71.6%
(299) of the choices were for the VO principle
and 28.4% (119) of the choices were
for the VC principle. 
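
As a quick arithmetic check, the shares can be recomputed from the response counts given above. The variable names are illustrative, and the printed percentages may differ from the recomputed ones in the last decimal.

```python
# Counts reported for the questionnaire responses.
vo_choices = 299
vc_choices = 119
total = vo_choices + vc_choices

# Share of choices for each principle, in per cent, to one decimal.
vo_pct = round(100 * vo_choices / total, 1)
vc_pct = round(100 * vc_choices / total, 1)

print(total, vo_pct, vc_pct)
```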


Discussion 
Fletcher has expressed the opinion that VO 
devices are “. . . in functional principle, the 
direct opposite of the normal action of prehension
to which the amputee has been accustomed
in his natural arm and hand” (5,
p. 224). In contrast, VC devices are said to
approximate the normal function and, therefore,
should be given preference in routine
prescription. It is the belief of the writers 
that even strictly physiological considerations 
would hardly justify labeling the VO principle 
as the direct opposite of normal prehension. 
Elimination of the sensory control of some 
part of the normal movement would be ex- 
pected to have a profound effect on the move- 
ment regardless of mode of control. In ad- 
dition, neither opening nor closing stands in 
any natural functional relationship to the 
shoulder shrug, biceps or pectoral contraction. 
These considerations suggest that any attempt 
to obtain “normal” neuromuscular control for 
a device in which muscle power is applied 
from muscles not previously associated with 
manual function is not a reasonable expecta- 
tion. At best, the operation of the device can 
be learned so that useful function results, but 
this learning cannot, in the writers’ opinion, 
be interpreted as conforming with normal 
physiological patterns. At the present state 
of the art a prosthesis must be considered a 
tool and not a replacement of “normal” func- 
tion. 

Our results clearly indicate that the mode 
of TD control is unrelated to the criterion of 
performance time. Time increases were only 
observed with the locking devices which force 
the amputee to divide his attention between 
the task and the operations necessary to make 
the device perform successfully. The VC de- 
vice, as it is commercially available, hardly 
permits the connotation “automatic locking 
device” because without constant attention of 
the amputee to the various phases of the mo- 
tion cycle, locking will not be achieved and 
the object will be dropped. 

The results of the questionnaires also cor- 
roborate our experimental findings in com- 
paring the VO and VC locking devices in cur- 
rent use, and can serve as a somewhat crude 
validity check for the criterion employed. 
This becomes especially apparent in the light 
of comments which the amputees made dur- 
ing the course of the testing indicating that 
preference for a certain device is related to one 
or more of the following factors in addition 
to performance efficiency in terms of time: 
habit, least effort during operation, feeling of 
security during operation, reliability of mechanism,
cost of upkeep, and a variety of personality
factors.

From a theoretical viewpoint, this study 
points out the danger of using a stereotyped 
opinion as the basis for equipment develop- 
ment for a man-machine system. It seems 
“normal” to an individual with two intact 
hands that the act of grasping is an act of 
“voluntarily closing” the fingers around the 
object. However, in this context the opening 
of the hand must also be considered “volun- 
tary.” In normal hand function both opera- 
tions are under closely integrated cortical con- 
trol. If only part of this control can be main- 
tained for the prosthesis, as is the case, the 
degree to which the part is effective can be 
assessed only by empirical testing and not by 
“logical” reasoning. The experimental evidence
from the present study strongly suggests
that the VO hook or hand would be the
preferable device for routine prescription
when time of performance and simplicity
of mechanism are used as criteria. This evi- 
dence is not complete, however, for there are 
two more pertinent arguments which have 
been brought up in favor of the VC mecha- 
nism (4, 5): (a) It has variable prehension 
force, permitting the amputee to adjust to the 
characteristics of the object. (b) It provides
a wider range of forces without modification 
of the mechanism. A final evaluation of the 
VO versus VC principle in prosthetic termi- 
nal devices has to be postponed until investi- 
gations currently in progress indicate whether 
these arguments are valid reasons for justify- 
ing the increased design complexity of VC 
devices. 


Summary 

The major purpose of this investigation was 
to test the hypothesis that a “voluntary clos- 
ing” type of prosthesis control system will 
lead to superior psychomotor performance for 
arm amputees when compared with a “volun- 
tary opening” type of system because of the 
closer “imitation of natural function” said to 
be provided by the former system. 


Comparisons of both types of control sys- 
tems for hook and hand prosthetic terminal 
devices were made for a total of 17 ampu- 
tees, using performance time as a criterion 
measure on three simple manipulation tests. 
Amputee preference for various types of de- 
vice was determined by a questionnaire. 

The results indicated that the mode of con- 
trol of the prehension device was unrelated to 
the criterion measure and that amputee pref- 
erence for a device is related to other factors 
than speed of performance, such as mechani- 
cal reliability. 

The conclusion based on time measure- 
ments is that there is no inherent superiority 
for either type of control; neither one stands 
in any natural functional relationship to the 
shoulder shrug or biceps contraction. 


Received August 1, 1956. 
Validity Indices for the Heston Personal Adjustment
Inventory 


Donavon Auble 


Western College for Women 


The value of a student’s college experience 
relates very directly to the success he achieves 
in making the transition from secondary 
school to college. It is common knowledge 
that this transition is not uniformly easy for 
students. Valid information relative to cer- 
tain personality dimensions of matriculants is 
of considerable value to all persons con- 
cerned with helping students “get used to 
college.” The purpose of this study was to 
appraise the utility of the Heston Personal 
Adjustment Inventory as an instrument for 
use in this connection. 


Description of Variables and Samples 


The Personal Adjustment Inventory (3) is 
composed of 270 questions to which the in- 
dividual is asked to answer “yes” or “no.” 
The inventory does not purport to give a final 
assessment of personality, but it does aim to 
provide a convenient preliminary summariza- 
tion of six traits thought to be important: 
analytical thinking, sociability, emotional sta- 
bility, confidence, personal relations, and home 
satisfaction. It is convenient to administer 
and score and can be interpreted by a per- 
sonnel worker without extensive training in 
clinical psychology. Although the Heston In- 
ventory has been rated as a better than av- 
erage instrument of its type (1), there is a 
scant amount of published research bearing 
on its utility (2, 4). 

During orientation week, September, 1954, 
all freshmen from the United States entering 
Western College for Women were given the 
Personal Adjustment Inventory. The test 
was administered in a group session in ac- 
cordance with instructions in the manual. 
Five students who omitted too many items to 
yield usable scores were contacted individu- 
ally and asked to complete the inventory ap- 
proximately two weeks after the first adminis- 
tration. The results of this testing were not 
released to anyone save the writer, who did
not participate in the subsequent phase of
the study or otherwise knowingly act on this 
information. The sample group originally 
consisted of 88 women. However, two of 
these students withdrew from school very 
early in the academic year and were not in- 
cluded in the validation study. 

After the Christmas recess, questionnaires 
were distributed to persons who had oppor- 
tunity to know the members of the sample 
group. Faculty members, dormitory heads, 
and senior counselors were asked to rate the 
girls they knew fairly well on each of the six 
personality traits included in the Heston Per- 
sonal Adjustment Inventory. To eliminate 
ambiguity arising from the trait labels, a 
descriptive paragraph from the manual defin- 
ing the traits was reproduced in the question- 
naire. A five-point scale was employed (1— 
exhibits this trait to a very low degree, 2— 
below average in regard to this trait, etc.). 
Returns were received from 54 (97%) of the 
persons asked to participate. The arithmetic 
means of the ratings received on each of the 
traits by a given student served as the cri- 
terion variables. In addition the raters were 
asked to check if they had observed certain 
types of behavior believed to be symptomatic 
of difficulty in making a satisfactory adjust- 
ment to college life. These behavioral cate- 
gories included items such as: seemed de- 
spondent, been homesick, seemed unduly 
fatigued, been touchy, and violated campus 
regulations. Since Western College is a small 
residential college with extensive opportunity 
for faculty-student contact, the ratings ob- 
tained were thought to have more than super- 
ficial relevance. 


Statistics and Validity Data 


A summary of the rating data is displayed
in Table 1. It should be noted that raters
were asked to express judgments only when
they felt qualified on the basis of their
familiarity with the individual to do so.
Relatively few of the raters submitted ratings for
the trait labeled “Home Satisfaction.”


Table 1
Ratings by Faculty and Counselors of Western College
Freshmen on Six Personality Traits

                                            Median No. of
Inventory Trait        Mean    SD           Raters/Student
Analytical Thinking    3.2     .33          8.1
Sociability            3.3     .22          7.4
Emotional Stability    3.3     [illegible]  7.5
Confidence             3.3     .28          7.8
Personal Relations     3.5     .26          5.8
Home Satisfaction      3.7     .34          2.7


Correlations between inventory scores and 
ratings are shown in Table 2. The reliability 
of each of the ratings was estimated by the 
split-halves technique and the Spearman- 
Brown prophecy formula. These estimates 
were used to correct the raw validity indices 
for attenuation in the criterion variable. 
Chi-square analyses were used to compare 
trait scores with ratings assigned by the single 
person most familiar with the members of the 
sample group, the Dean of Students. None 
of the six computed chi-square values were 
significant at the 5 per cent level, indicating 
a lack of agreement between inventory scores
and this individual’s ratings.


Table 2
Correlation Between Heston Personal Adjustment
Inventory Scores and Mean Trait Ratings
by Faculty and Counselors
(N = 86)

                                r Corrected for
                                Attenuation in
Inventory Trait        Raw r    the Criterion
Analytical Thinking    .35      .38
Sociability            .34      .52
Emotional Stability    .18      .21
Confidence             .08      .10
Personal Relations     .07      .09
Home Satisfaction      .03      .20    (N = 83)*

* Three cases omitted due to insufficiency of rating data.


Table 3
Comparison of Heston Personal Adjustment Inventory
Trait Scores for Women Students Evidencing Most
Difficulty in Adjusting to College (DIFF) and
for Other Women Students (NONDIFF)

Inventory Trait           Group      Mean
A—Analytical Thinking     DIFF       22.58
                          NONDIFF    28.48
S—Sociability             DIFF       36.48
                          NONDIFF    27.46
E—Emotional Stability     DIFF       27.68
                          NONDIFF    29.02
C—Confidence              DIFF       30.36
                          NONDIFF    31.48
P—Personal Relations      DIFF       30.32
                          NONDIFF    26.42
H—Home Satisfaction       DIFF       35.34
                          NONDIFF    31.43

(N, SD, and t columns illegible.)
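
The split-halves estimate and the correction for attenuation in the criterion described above can be sketched as follows. The formulas are the standard Spearman-Brown and attenuation formulas; the function names and the split-half value of .60 are hypothetical, and the last helper simply backs out the criterion reliability implied by one raw and corrected pair.

```python
import math


def spearman_brown(r_half: float) -> float:
    """Step a split-half correlation up to full-length reliability."""
    return 2 * r_half / (1 + r_half)


def correct_for_criterion_attenuation(r_raw: float, criterion_reliability: float) -> float:
    """Correct a validity coefficient for unreliability in the criterion only."""
    return r_raw / math.sqrt(criterion_reliability)


def implied_criterion_reliability(r_raw: float, r_corrected: float) -> float:
    """Criterion reliability implied by a raw and a corrected coefficient."""
    return (r_raw / r_corrected) ** 2


# Hypothetical split-half correlation between two halves of the rater pool.
reliability = spearman_brown(0.60)
corrected = correct_for_criterion_attenuation(0.30, reliability)
print(round(reliability, 2), round(corrected, 2))

# Reliability implied by a raw r of .35 and a corrected r of .38.
print(round(implied_criterion_reliability(0.35, 0.38), 2))
```

Because only the criterion is corrected, the adjusted coefficient estimates how well the inventory would agree with perfectly reliable ratings, not with a perfectly reliable inventory.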

Even if the validity coefficients for the to- 
tal group were relatively low, a personality 
assessment inventory would be quite useful if 
it could identify in advance those individuals 
who would have particular difficulty in ad- 
justing to college life. This use of the Heston 
Personal Adjustment Inventory was investi- 
gated. Trait scores for the 10 per cent of 
cases displaying the largest number of symp- 
toms of difficulty in adjusting to college were 
compared with the scores of the remainder of 
the sample group. Student’s t test for uncorrelated
measures yielded the statistics of
Table 3. None of these values are significant 
at the 5 per cent level. 
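
A comparison of this kind can be sketched with the pooled-variance form of Student's t for two independent groups. The scores below are hypothetical and the group sizes are chosen only for illustration; they are not the study's data.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for two uncorrelated samples."""
    na, nb = len(a), len(b)
    # Pooled estimate of the common variance.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, na + nb - 2


# Hypothetical trait scores for a small DIFF group and a larger NONDIFF group.
diff_scores = [22, 19, 27, 25, 20]
nondiff_scores = [28, 31, 24, 30, 26, 29, 33, 27]

t_value, df = pooled_t(diff_scores, nondiff_scores)
print(round(t_value, 2), df)
```

The resulting t would be referred to a table of Student's distribution with the stated degrees of freedom.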

Similar results were obtained for the two students
who voluntarily withdrew from school
within the first few weeks of the semester and 
for the single girl asked to withdraw at mid- 
year for failure to attend classes, i.e., the 
scores of these students on the Heston In- 
ventory were not significantly different from 
the scores of other students. 







Conclusions 


The correlation coefficients reported here, 
indicative of association between trait scores 
and faculty ratings, are considerably lower
than similar indices reported from other stud- 
ies cited in the Heston Personal Adjustment 
Inventory Manual (3). Further, the inven- 
tory failed to discriminate between students 
who seemed to have considerable difficulty 
adjusting to college and those who did not. 
On the basis of the evidence in this study one 
must conclude that the Heston Personal Ad- 
justment Inventory is of limited practical 
value for predicting adjustment to college 
life. 
References

1. Buros, O. K. The fourth mental measurements
   yearbook. Highland Park, N. J.: Gryphon,
   1953. Pp. 97-100.

2. Heston, J. C. A comparison of four masculinity-femininity
   scales. Educ. psychol. Measmt,
   1948, 8, 375-387.

3. Heston, J. C. Manual, Heston Personal Adjustment
   Inventory. New York: World Book
   Co., 1949.

4. Michaelis, J. U., & Tyler, F. T. Diagnostic and
   predictive value of the Heston Inventory used
   in student teaching. J. teacher Educ., 1950,
   1, 40-43.








Journal of Applied Psychology
Vol. 41, 1957


Attention Value as a Function of Illuminant Color Change¹


Albert E. Bartz 


North Dakota University 


As more electronic innovations are added 
to the design of aircraft the man-machine re- 
lationships become more involved and the 
presentation of information to the operator 
becomes increasingly complex. 

Aircraft instruments serve for three basic 
types of reading: (a) check reading for the 
presence of a deviation; (b) qualitative read- 
ing for the meaning of a deviation; and (c) 
quantitative reading for the numerical value 
of a deviation (6). This paper is concerned 
with functions of the first type and employs 
the rotating type indicator. 

Many experiments have been conducted re- 
cently concerning check reading with different 
dial configurations. These experiments have 
shown that the best configuration with 16 
dials is four banks of four dials each with 
normal operating position employed in a 9:00 
symmetrical presentation (9), as shown in 
the left panel of Fig. 1. This design pre- 
sents an optimal condition in detecting devia- 
tions in multiengine aircraft. 

A basic problem then is a consideration of 
how this check reading may be facilitated 
in terms of speed and accuracy. This study 
is concerned with the use of an illuminant 
color change to aid the operator in his check 
reading. 


Method 


Experimental procedure. Two situations were compared.
In one, the pointers deviated while the color
of the illuminant remained the same (R situation). 
In the other (RG situation), deviating pointers in 
the error dials were accompanied by a change from 
the normal illuminance of red (1) to an illuminance 
of green. 

An error dial in the R situation was defined as a 
pointer displacement from the normal 9:00 position 
to 30° above normal, or approximately 10:00. Dur- 
ing this procedure the illuminant remained red. 


1 This paper is a summary of a thesis submitted to
the Graduate Faculty of North Dakota University 
in partial fulfillment of the requirements of the M.A. 
degree. The author wishes to express his gratitude to 
Dr. Herman F. Buegel and Dr. Kermit J. Rohde for 
their encouragement and assistance in this project. 
An error dial in the RG situation consisted of a
deviating pointer plus an illuminant color change. 

Apparatus. A simulation of actual equipment was 
used instead of a tachistoscopic presentation, since 
some authorities believe that the latter presentation 
relies heavily upon afterimages (4). 

The apparatus consisted of an instrument panel, a 
relay rack, and a timer. The instrument panel was 
an 18-inch square of plywood with four banks of 
four dials each as shown in Fig. 1. The dials were 
of uniform size, 2.75 inches in diameter (3), and 
were black with white pointers and white interval 
markings (2). 
Fig. 1. The instrument panel. The presentation
on the left is normal. In the configuration on the
right, dials 2, 6, 10, and 14 are in error.

















The pointers could be made to deviate by a sole- 
noid arrangement on the back of each dial which 
turned the pointer to a position 30 degrees above 
normal. The graduation mark dimensions were .22 
× .035 inch for the main points and .16 × .030 inch
for the intermediary marks (1). The pointers were 
.094 inch wide and just long enough to touch the 
smallest graduation mark (8). 

The color of the dial illuminant in aircraft is red 
for purposes of dark adaptation. Each dial was 
supplied with a red illuminant for the normal op- 
erating position and a green illuminant for the de- 
viating position. The brightness of each dial was in 
the range from 0.02 to 1.0 ft.-l., and was controlled 
by a rheostat (1). 

Through a relay circuit any number and combina- 
tion of dials could be made to remain red when nor- 
mal, remain red when deviating, or change from red 
to green when deviating. 

Physical conditions. The testing was conducted in 
a lightproof room. The lighting on the instrument 
panel was from the individual dial illuminants only. 
Enough light was scattered over the face of the ap- 
paratus to prevent the front panel from having zero 
brightness (1). The panel was placed 28 inches 
from S, perpendicular to his line of sight (1). A 
push-button switch was attached to the arm rest of
S’s chair. 

The sample. The sample consisted of 64 students, 
18 women and 46 men, enrolled in introductory psy- 
chology courses at the university. The ages ranged 
from 18 through 26. Prior to testing, each student 
was checked for evidence of color blindness with the 
Dvorine Color Plates. 

Test procedure. After each S had taken the 
Dvorine test, he was instructed to wear a pair of 
dark red goggles for 30 minutes for purposes of 
dark-adaptation (7). 

After being admitted to the testing room, each S 
was told that he would be presented with a series of 
deviating dials. There would be different numbers 
and different combinations of dials in an error po- 
sition. The S was shown what constituted an error 
dial in his first situation. The S was to scan the 
panel when the configuration was given, and to press 
his switch as soon as he felt sure that he knew the 
number of error dials. This returned the dials to 
normal. Then S was to report the number of error 
dials, and reset his switch for the next presentation. 
The E warned S five seconds before each presenta- 
tion. After eight configurations of one situation, S 
was shown what constituted an error dial for the 
next series of eight under the other situation. The 
S was presented with both situations in one sitting. 
The order of the two situations presented to each S 
was randomized to nullify practice effects. 

From one through eight error dials were presented. 
The order, number, and location of the error dials 
were randomized. The 1,024 presentations (eight 
under R, eight under RG for each S) were matched 
in the R and RG situations. That is, any given 
configuration occurring in one S’s R situation was 
matched by the same configuration in another S’s 
RG situation. This was necessary because it is con- 
ceivable that any two configurations containing the 
same number of error dials might not be of the 
same difficulty. 
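
The presentation scheme can be sketched roughly as below. It assumes, as a simplification not stated in the text, that each S saw each count from one through eight error dials exactly once per situation; the function names and the fixed seed are illustrative only.

```python
import random
from typing import List, Tuple

N_DIALS = 16          # four banks of four dials
N_SUBJECTS = 64
N_SITUATIONS = 2      # R and RG


def random_configuration(n_errors: int, rng: random.Random) -> Tuple[int, ...]:
    """Pick which dials (numbered 1 to 16) deviate in one presentation."""
    return tuple(sorted(rng.sample(range(1, N_DIALS + 1), n_errors)))


def series_for_subject(rng: random.Random) -> List[Tuple[int, ...]]:
    """Eight configurations with one through eight error dials, in random order."""
    counts = list(range(1, 9))
    rng.shuffle(counts)
    return [random_configuration(k, rng) for k in counts]


rng = random.Random(0)
series = series_for_subject(rng)

# Matching: the series used for one S under R is reused for another S under RG.
matched_pair = {"R (subject A)": series, "RG (subject B)": series}

total_presentations = N_SUBJECTS * N_SITUATIONS * len(series)
print(total_presentations)
```

Reusing the identical configurations across the two situations is what keeps configuration difficulty from being confounded with the R versus RG comparison.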

Method of scoring. A scoring sheet was prepared 
for each S listing dial patterns, number of errors,
and response times. When S pressed his switch, it 
stopped the timer and the response time for each 
configuration was recorded. Response times for each 
of the eight configurations for both situations were 
totaled to give two scores for each S. If S re- 
ported an incorrect number of dials in any con- 
figuration it was recorded as an error. 


Results 


The RG situation was responded to faster 
and more accurately than the R situation. 
The variability of response also was signifi- 
cantly less for the RG situation. 

A preliminary analysis of variance showed 
no significant difference in the reaction times 
between males and females for either the R 
or RG situation. 


Table 1
Means, Standard Deviations, and Standard Errors
for the R and RG Situations

            R        RG       t
Mean        24.06    12.59    16.14*
SE_M        .86      .41
σ           6.90     3.29     6.12*
SE_σ        .61      .29

* Significant at .01 level.


To test the null hypothesis that no real difference
exists between the R and RG situations,
t tests for correlated samples were computed
for the reaction times. The results are
summarized in Table 1. Differences between 
means and differences between standard de- 
viations were significant. 
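
The t test for correlated samples mentioned above can be sketched as follows. This is only an illustration of the general technique: the function name and the sample values are hypothetical, and the source does not report the individual scores or the exact computing formula the authors used.

```python
from math import sqrt


def paired_t(x, y):
    """Return the t statistic for correlated (paired) samples.

    The statistic is the mean of the pairwise differences divided by
    the standard error of that mean; it has len(x) - 1 degrees of freedom.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased variance of the differences (divisor n - 1).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_d / sqrt(var_d / n)


# Hypothetical total response times (secs) for three Ss under R and RG.
r_times = [12.0, 14.0, 16.0]
rg_times = [10.0, 10.0, 10.0]
t_value = paired_t(r_times, rg_times)
```

Because each S served under both situations, the differences are taken within Ss, which is what distinguishes this test from a t test for independent groups.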

Mean response times for one through eight 
error dials in the R and RG situations are 
plotted in Fig. 2. To test the null hypothesis 
that no real differences exist between the 
two situations, a t test was computed for each 
number of error dials. All differences between 
these mean response times were significant 
at the .01 level. 

Fig. 2. Mean response times in R and RG situations 
for configurations containing one through eight 
error dials. (Axes: mean response time in secs 
against number of deviating dials.) 

The total number of errors made in both 
situations was also recorded. The mean numbers 
of errors committed were 1.05 for the R situation 
and 0.38 for the RG situation. 
Discussion 


The results can be briefly summarized as 
follows: (a) In both situations, the mean re- 
sponse time increased as a function of the 
number of error dials. However, the mean 
response times were significantly reduced in 
the RG situation. (b) The number of errors 
increased as a function of the number of deviating 
dials. In this case also, the RG situation 
was superior, reducing the errors by 
almost two-thirds. (c) Sex differences were not 
significant in this type of perceptual motor task. 

The first finding is what we might intui- 
tively expect. As the visual task becomes in- 
creasingly difficult, there is a comparable in- 
crease in search time under both situations. 
The difference between total response times 
under the R and RG situation is significant. 
Since the difference between standard devia- 
tions is also significant, the variability of re- 
sponse under the R situation must be much 
greater. Comments from Ss seemed to give 
the impression that under the RG situation, 
the color of the dial was used as the primary 
cue in spotting error dials while the pointer 
deflection was secondary. In the R situation, 
only the pointer deflected, and thus was 
the only cue for check reading. 

In practical situations, the first objective of 
check reading is to spot a nonnormal indica- 
tion, and then a quantitative or direction of 
deviation reading is secured. If the accuracy 
of check reading depends on the attention 
value of the visual task, a pointer deflection 
plus a change from red to green illumination 
would be superior to pointer deflection alone. 

The second result is what also might be ex- 
pected. As the task became more difficult, 
more errors were committed. However, al- 
most three times as many errors were com- 
mitted under the R situation as under the 
RG situation. 

There were no sex differences found, a re- 
sult which is in agreement with other experi- 
ments in the perceptual motor area (5). 


Summary 


The present study was concerned with the 
effect of several characteristics of the visual 
task upon speed and accuracy of visual search. 
Two situations were investigated. In the R 
situation, from one through eight error dials 
were presented while the normal illumination 
of red was held constant. In the RG situa- 
tion, error dials were accompanied by a change 
in illuminant from red to green. It was found 
that: (a) Differences between response times 
and also variability of response under the R 
and RG situations were significant; (b) three 
times as many errors were committed under 
the R situation; (c) no significant sex differ- 
ences were present under either the R or RG 
situation. 

The results would indicate that the differ- 
ences in variability of response were due to 
the use of two different types of visual search: 
(a) check reading for pointer alignment only 
and (b) check reading with the aid of color 
cues. 


Received May 24, 1956. 
The Reliability of Peer Nominations Under Various 
Conditions of Administration ¹ 


E. P. Hollander 


Carnegie Institute of Technology 


Recent years have seen an increasing ap- 
plied utilization of some form of peer-evalua- 
tion technique. In basic terms, this involves 
each group member’s assessment of his peers 
on some identifiable quality or complex of 
characteristics which is either directly observ- 
able, or indirectly inferrable, from personal 
interactions. Such individual assessments are 
then integrated into a composite score reflect- 
ing each person’s standing within his group. 
The advantage of this technique appears to 
reside in its ability to tap data drawn from 
intimate contact among personnel (2, p. 390). 
Furthermore, it evidently yields results. 

Peer-evaluation measures have been sub- 
jected to numerous validity studies which have 
established their predictive utility against vari- 
ous performance criteria (3, 6, 7, 8, 11, 12, 
13, 14, 15). Although these do lend support 
to broader application, it is nevertheless true 
that several questions remain to be answered 
regarding optimum employment. Notable 
among these is the matter of factors affecting 
reliability. 


Problem 


The particular instrument of attention in 
this study was the peer nomination, one of 
the more widely used of the peer-evaluation 
techniques. Three core variables relating to 
the reliability of peer-nomination scores were 
specified for consideration: first, the length of 
time a group must have spent together before 
scores will approximate maximized reliability; 
second, the presence of any differential effects 
on reliability accruing from the use of forms 
with a “research” set as against those with a 
“real” set; and, finally, the variations in 
reliability which may be attributable to the 
nature of the quality on which the nominator is 
instructed to make nominations. 

1 This study was supported by Contract onr 760(06) 
with the Office of Naval Research. The author is 
pleased to acknowledge the splendid cooperation of 
that agency and the U. S. Naval School, Officer 
Candidate. Opinions or conclusions contained in this 
report are those of the author; they are not to be 
construed as necessarily reflecting the view or the 
endorsement of the Navy Department. 


Method 


Subjects and setting. The sample consisted of 23 
sections of officer candidates entering the U. S. Naval 
School, Officer Candidate, in Newport, Rhode Island, 
during July of 1955. The total N available exceeded 
709 at the beginning of this study. The sections 
numbered approximately thirty each; there is no 
reason to suppose that assignment to these sections 
was on anything other than a random basis. 

The program at the OCS is of sixteen weeks’ dura- 
tion, with an orientation week introduced before the 
actual onset of the training cycle. During this one- 
week period, student personnel are assigned to sec- 
tions, receive books and clothing, take classification 
tests, receive orientation lectures, but do not attend 
formal classes, as such. 

Except for a small minority drawn from the fleet— 
in this class numbering fewer than 5%—all of the 
students were graduates of four-year college pro- 
grams. The mean age of this class was 22 years 
with only a minimal dispersion about this figure. 
Students at the OCS are selected according to rigor- 
ous mental and physical standards. All are volun- 
teers and must agree to remain on active duty as 
officers for three years following the successful con- 
clusion of training. 

Instruments. The previous work of Suci, Vallance, 
and Glickman (9) established several points which 
bore upon the selection of instruments for this study. 
Their research indicates that “. . . ratings by peers 
based on either the behavior at OCS or projected, 
future behavior are equally reliable . . .” (9, p. 11); 
in addition, they report that “The technique which 
requires selection of the upper and lower segments 
appears to have as satisfactory reliability as any of 
the other tested techniques .. .” (9, p. 11); finally, 
they note that the order of presentation which yields 
the lowest correlation between forms is “future offi- 
cer” followed by “OCS behavior” (9, p. 12). 

Four sociometric forms of the peer-nomination va- 
riety were utilized. Based upon the research cited 
above (9), a primary form calling for nominations 
on “success as a future Naval Officer” (FO) was ad- 
ministered to all sections. This form was seen to be 
of particular worth in its likely prediction of more 
distant, fleet performance criteria. In addition to 
this form, each section received one of three other 
forms, i.e., “leadership qualities” (LQ), “interest in 
and enthusiasm for the Naval Service” (IE),² and 
“probability of success in OCS” (OC). The selection 
of these forms rested upon a need to tap those 
characteristics which might relate to both in-training 
and post-training performance, i.e., interpersonal 
qualities, motivation, and ability having evident 
relevance to OCS performance. 

Cutting across this pattern, approximately half the 
sections received a “research” set with the explicit 
point, appearing on their peer-nomination forms, 
that “The results of these ratings are to be used for 
research purposes only and will not affect your Navy 
career” (the so-called “RO” set). The other sections 
were given equally explicit instructions that “The re- 
sults of these ratings may be used for administrative 
purposes” (the “AU” set). This split in treatment 
was designed to provide data on any differential ef- 
fects attributable to administration under a “re- 
search” set, as against an “administrative” set, for 
all four forms. 

In all there were eight possible treatments, i.e., 
four characteristics to be rated times two sets. All 
of the forms administered required five “high” and 
five “low” nominations in order of preference. This 
follows a standard procedure described elsewhere (4). 
An alphabetical section roster was attached. 

Design. The 23 sections were divided into six 
blocks of four sections each, except for one block 
which, of necessity, was limited to three sections. 
Sections were assigned to blocks on a rotation basis 
from the five companies in the second battalion. 
Such differences as might exist between companies 
were thus restricted in their conceivable ability to 
contaminate the study design. 

Once having been assigned to a given block, the 
treatment of any given section was identical through 
training. Three major administrations of these forms 
were carried on during the training cycle: the first 
occurred during the so-called “orientation week” 
after the subjects had been together in their respec- 
tive sections for four to five days; the second at the 
end of the third week of training; and the third at 
the end of the sixth week of training. The design 
was replicated, therefore, a total of three times. At 
the end of the thirteenth week of training, another 
administration was made, but this last time only the 
FO form was used. In all other respects, the design 
was identical for the latter administration. 

Scoring. Following the pattern utilized in several 
studies elsewhere (cf. 2), a direct weighting pro- 
cedure was applied to derive peer-nomination scores. 
The highest nominee was awarded a +5, the next 
highest a +4, and so on through the five “highs”; 
similarly, the lowest nominee was assigned a -5, the 
next lowest a -4, and so on. An algebraic sum was 
then obtained for each subject and divided by the N 
of the group minus 1, since no subject may nominate 
himself. This results in an average score ranging on 
a continuum from +5 to -5. To remove the minus 
sign, a constant of 5 was added to this score; the 
resultant value was then multiplied by 10 in order 
to permit the use of a two-digit score without the 
intervening decimal point. Where a subject had an 
average raw value of -5, and hence a score of 00, 
he was arbitrarily given a score of 01, after the 
constant had been added and the multiplication had 
taken place. At the other end of the range, the +5 
subject was given a score of 99, rather than 100. 

2 This particular form was derived from a study 
by Webb and Hollander (11) on the prediction of 
voluntary withdrawal from flight training as a 
morale criterion. 

The distribution arising from this procedure has 
normal characteristics with a range from 1 to 99, a 
mean of 50, and a standard deviation approximating 
10, for the total population of the study. While this 
score may be seen to have certain features of a stand- 
ard score, it does not tend to obscure section differ- 
ences so much as does the standard score. 
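
The weighting procedure described under Scoring can be sketched as below. The function name and the sample weights are hypothetical, and rounding to the nearest whole number is an assumption, since the text says only that a two-digit score was wanted.

```python
def peer_nomination_score(weights, group_size):
    """Convert the nomination weights one subject received into a 01-99 score.

    weights    -- the signed weights (+5..+1 for "high", -1..-5 for "low")
                  given to this subject by the raters who nominated him
    group_size -- N of the section, including the subject himself
    """
    if group_size < 2:
        raise ValueError("a section needs at least two members")
    # Algebraic sum divided by N - 1, since no subject may nominate himself.
    average = sum(weights) / (group_size - 1)
    # Add the constant 5 to remove the minus sign, then multiply by 10.
    scaled = round((average + 5) * 10)  # rounding rule is an assumption
    # Scores of 00 and 100 are replaced by 01 and 99 respectively.
    return min(99, max(1, scaled))


# Hypothetical subject in a section of 31 who drew a +5, a +4, and a -3.
example_score = peer_nomination_score([5, 4, -3], 31)
```

A subject nominated by nobody receives the midpoint score of 50, which agrees with the mean of 50 reported for the resulting distribution.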


Results and Discussion 


Two major approaches may be followed in 
determining the reliability of peer-nomination 
data. In the first place, one may focus at- 
tention on the internal consistency of the 
nominations made within a given group at 
some discrete point in time. In the second 
place, one may deal with their consistency 
over time, as reflected in repeat administra- 
tion (cf. 1, p. 14). 

Of the two approaches, the former is more 
usually applied. The latter one has evident 
disadvantages in that time exposure is very 
likely to have an impact in altering the po- 
sition of subjects on the status continuum. 
This raises the question of whether, indeed, a 
“good” peer-nomination form ought to have 
high repeat reliability; in point of fact, one 
might wish to use peer nominations precisely 
for a study of temporal fluctuations in status, 
as well as the extent to which status is main- 
tained. Thus, a low-level correlation between 
scores yielded by two administrations of the 
same peer nomination form is very often evi- 
dence of an unstable group pattern, or un- 
stable individual behavior, rather than of an 
inherent unreliability attributable to the form 
itself. 

For our own purposes, we find both varie- 
ties of reliability of concern since we shall 
wish to know the internal consistency of scores 
obtained from various forms at various times, 
and the relationship over time of scores ob- 
tained from two administrations of the same 
form. Accordingly, both approaches were 
used in this study. 


Table 1 

Corrected Split-Half Reliabilities of Eight Peer-Nomination 
Treatments at Various Stages of Training 

Columns: Treatment by Forms and Sets; Week in Training 
(Orientation, Third Week, Sixth Week, Thirteenth Week) 

Treatments: 
“Future Officer”, Research Set 
“Future Officer”, Administrative Set 
“Interest and Enthusiasm”, Research Set 
“Interest and Enthusiasm”, Administrative Set 
“Success in OCS”, Research Set 
“Success in OCS”, Administrative Set 
“Leadership Qualities”, Research Set 
“Leadership Qualities”, Administrative Set 

97 97 97 
92 91 91 


97 97 95 
96 95 91 
96 96 

32 29 29 

88 89 94 

32 32 32 


93 .98 29 
32 31 31 


89 97 98 
32 32 31 


93 
32 32 31 


94 97 .98 
32 32 32 








* The number below each coefficient indicates the N upon which it is based. 


In practice, the calculation of single-stage 
reliability, or internal consistency, of a peer- 
nomination score is normally undertaken by 
an odd-even split of the raters within the 
group so as to afford two measures of status. 
The correlation between these measures is 
then treated by the Spearman-Brown formula 
to yield a corrected reliability coefficient. 
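
The odd-even procedure just described can be sketched as follows. The layout of the ratings (one row per rater, one column per nominee, zero where no nomination was made) and all names are assumptions made for illustration; the source does not give its computing routine.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full length: 2r / (1 + r)."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(ratings):
    """Corrected split-half reliability from an odd-even split of raters.

    ratings[i][j] is the signed weight rater i gave nominee j (0 if none).
    """
    first_half, second_half = ratings[0::2], ratings[1::2]
    nominees = range(len(ratings[0]))
    # Each half of the raters yields its own status measure per nominee.
    scores_a = [sum(row[j] for row in first_half) for j in nominees]
    scores_b = [sum(row[j] for row in second_half) for j in nominees]
    return spearman_brown(pearson_r(scores_a, scores_b))


# Hypothetical section: four raters, four nominees.
ratings = [
    [5, 4, -4, -5],
    [4, 5, -4, -5],
    [5, 4, -5, -4],
    [5, 4, -5, -4],
]
reliability = split_half_reliability(ratings)
```

The correction is needed because each half-score rests on only half the raters; the Spearman-Brown formula estimates the reliability of the score based on the whole group.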

Table 1 reports the corrected split-half 
reliability coefficients calculated at various 
points in the life cycle of relevant groups for 
the eight peer-nomination scores. The re- 
duced Ns reported were based upon a ran- 
dom selection of representative sections re- 
ceiving the treatments involved. The identi- 
cal sections are studied at each time period 
so as to control section variations which might 
obscure time effects. There is good reason to 
believe from cross analyses that these sections 
are literally representative of their treatment 
block. 

It will be seen that the reliabilities in the 
first column, for all treatments, approximate 
.90, even though the sections had been formed 
only four or five days before. Omitting 
considerations of validity, it is striking 
to note the rapidity with which a group per- 
ception of individuals appears to have crystal- 
lized. This high reliability is of particular 
note when one considers that previous stud- 
ies, based upon peer-nomination scores drawn 
from later weeks of training, show rs which 
are not significantly greater (2). This is also 
reflected in our data here. 

The yield, as regards higher reliabilities, is 
greater, but not significantly so, as one pro- 
ceeds to later time periods. It would appear 
that the major increase occurs from the ori- 
entation week to the third week, after which 
the coefficients are stabilized. This is par- 
ticularly discernible in the case of the FO 
forms which were carried through to a thir- 
teenth-week administration. 

With respect to the reliability of comparable 
forms administered under a “research” 
(RO) as opposed to an “administrative” 
(AU) set, no differences of a significant 
magnitude may be noted at any time level. Their 
respective patterns are practically identical. 

A contrast between the intercorrelation of 
scores for the “future officer” form, adminis- 
tered at various stages, under a research set 
as against an administrative set is to be seen 
in Table 2. In most respects the matrices are 
quite similar. Both indicate a sequential de- 
crease in correlation between the orientation- 
week scores and those scores obtained from 
later administrations; both are notable for the 
high correlation, i.e., .90 and .94 respectively, 
between scores derived from the third- and 
sixth-week administrations; and, in general, 
both reflect a stability of measure from the 
third-week administration onward. 

Tables 3, 4 and 5 follow on the pattern 
established in Table 2. In each case two matrices 
are provided, one for the research set 
and one for the administrative set, indicating 
the intercorrelation of comparable forms 
administered at three time levels. Paired 
comparison of comparable coefficients in the 
upper and lower matrices of each of these four 
tables yields t values which are not statistically 
significant. A high relationship (about 
.90) is revealed between third- and sixth-week 
nomination scores for all forms. An average 
intercorrelation for each triad, using a z 
transformation of rs, was computed. These average 
rs are presented in the center of each 
triad. A t test of these reveals no significant 
difference between sets for any of the forms, 
including FO, where only its first three 
administrations were considered. 


Table 2 

Intercorrelation of Peer-Nomination Scores for the 
“Future Officer” Form Administered Under 
Two Sets Independently at Four 
Stages of Training 

“Future Officer”, Research Set (N throughout = 349) 

              “3” Week      “6” Week      “13” Week 
“0” Week        .74           .61           .53 
“3” Week                      .90           .81 
“6” Week                                    .88 
Average r of the “0”, “3”, “6” triad: .78* 

“Future Officer”, Administrative Set (N throughout = 320) 

              “3” Week      “6” Week      “13” Week 
“0” Week    [illegible]       .65           .56 
“3” Week                      .94           .83 
“6” Week                                 [illegible] 
Average r of the “0”, “3”, “6” triad: .81* 

* These average rs were calculated from the triad by 
application of Fisher's z transformation. 


Table 3 

Intercorrelation of Peer-Nomination Scores for the 
“Interest and Enthusiasm” Form Administered 
Under Two Sets Independently at 
Three Stages of Training 

“Interest and Enthusiasm”, Research Set (N throughout = 119) 

              “3” Week      “6” Week 
“0” Week        .78           .70 
“3” Week                      .91 
Average r of the triad: .82* 

“Interest and Enthusiasm”, Administrative Set (N throughout = 116) 

              “3” Week      “6” Week 
“0” Week        .71           .63 
“3” Week                      .87 
Average r of the triad: .76* 

* These average rs were calculated from the triad by 
application of Fisher's z transformation. 


Table 4 

Intercorrelation of Peer-Nomination Scores for the 
“Success in OCS” Form Administered Under 
Two Sets Independently at Three 
Stages of Training 

“Success in OCS”, Research Set (N throughout = 112) 

              “3” Week      “6” Week 
“0” Week        .59           .40 
“3” Week                      .88 
Average r of the triad: .68* 

“Success in OCS”, Administrative Set (N throughout = 82) 

              “3” Week      “6” Week 
“0” Week        .65        [illegible] 
“3” Week                   [illegible] 
Average r of the triad: .74* 

* These average rs were calculated from the triad by 
application of Fisher's z transformation. 


Table 5 

Intercorrelation of Peer-Nomination Scores for the 
“Leadership Qualities” Form Administered 
Under Two Sets Independently at 
Three Stages of Training 

“Leadership Qualities”, Research Set (N throughout = [illegible]) 

              “3” Week      “6” Week 
“0” Week        .82        [illegible] 
“3” Week                   [illegible] 
Average r of the triad: [illegible] 

“Leadership Qualities”, Administrative Set (N throughout = 122) 

              “3” Week      “6” Week 
“0” Week        .85           .78 
“3” Week                      .95 
Average r of the triad: .88* 

* These average rs were calculated from the triad by 
application of Fisher's z transformation. 
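
The averaging of rs through the z transformation can be illustrated with a short sketch. The function name is hypothetical; the three input values are the research-set “Future Officer” coefficients for the “0”, “3”, and “6” week administrations (.74, .61, and the .90 cited in the text), for which an average of .78 is reported.

```python
from math import atanh, tanh


def average_r(rs):
    """Average correlation coefficients by way of Fisher's z transformation.

    Each r is converted to z = atanh(r), the zs are averaged, and the
    mean z is converted back to r with tanh.
    """
    if not rs:
        raise ValueError("need at least one coefficient")
    zs = [atanh(r) for r in rs]
    return tanh(sum(zs) / len(zs))


# "Future Officer" form, research set: r(0,3), r(0,6), r(3,6).
fo_research_average = average_r([0.74, 0.61, 0.90])
```

Averaging in z units rather than averaging the rs directly gives somewhat more weight to the higher coefficients, because the sampling distribution of r becomes increasingly skewed as r approaches 1.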

Of all the forms, LQ evidences the highest 
average intercorrelation of the three scores. 
This is significantly greater for LQ-RO against 
OC-RO, and for LQ-AU against both OC-AU 
and IE-AU (p< .05). If any one form is 
to be considered to yield the most consistent 
score over time, then this is the LQ form. 
Since the evidence is somewhat spotty, and 
since consistency may have its limitations, it 
would not do to suggest that the LQ is per- 
force the “best” form. 

It should also be noted, in this regard, that 
the LQ scores have a substantially higher in- 
tercorrelation with FO scores at each time pe- 
riod than do either IE scores or OC scores 
(p< .01). From the magnitude of the rs 
involved (about .90), it would appear that 
the FO and LQ forms are being perceived as 
essentially similar. Regrettably, the design 
of the study did not permit a complete matrix 
of intercorrelations among forms, although we 
have attained some picture of this from va- 
lidity data reported elsewhere (5). 


Conclusions 


The data available from both varieties of 
reliability determination are essentially 
mutually supportive. The internal consistency 
of measure is high throughout the time se- 
quence. Early nominations manifest a sig- 
nificant relationship to later nominations, by 
the same groups, with the same forms. By 
the third week—and perhaps sooner, had we 
taken a sounding then—the nomination score 
is stabilized, at least insofar as its correla- 
tion with the sixth-week score is concerned. 

Of particular interest is the question of the 
eventual worth of the very early, i.e., orienta- 
tion week, ratings obtained here. While their 
relationship to later ratings is high, there is 
certainly not a one-to-one correspondence evi- 
dent. That this should be so is understand- 
able in view of the greater range of infor- 
mation available to the rater at later stages. 
Indeed, we may speculate that the essential 
virtue of the early rating is that it is based 
upon personal contact without the direct 
intrusion of academic performance considerations, 
which usually correlate appreciably with 
peer nominations in settings of this kind. 

In this regard it is of interest to note the 
high temporal stability of the leadership rat- 
ing, which tends to be founded in an “inter- 
personal quality” rather than an academically 
loaded performance characteristic. Again, 
what this means in validity terms is an open 
question for the time. It is probable, how- 
ever, that this has implications regarding the 
contribution of unique variance in the pre- 
diction of more ultimate criteria. Added to 
this picture, too, is the relatively lower average 
intercorrelation of the “success in OCS” 
scores, which are very likely subject to a 
greater degree of immediately observable fluc- 
tuations in classroom performance. 

Since the “future officer” score reveals an 
average intercorrelation with itself which is 
roughly intermediate for the range set by the 
other forms (LQ, OC, and IE), we may infer 
that it is sensitive to a broader range of im- 
pressions than the “leadership qualities” score, 
at one extreme, or the “success in OCS” score, 
at the other. This seems understandable in 
view of the extrapolation required to a dis- 
tant, and possibly diffuse, criterion. 

The differences obtained between the ad- 
ministrative and research sets are minimal as 
regards any gain in reliability using one as 
opposed to the other. Among other points, 
it may be suggested that results already ob- 
tained from peer-nomination studies, where a 
research set was involved, may be taken to 
have “real life” implications. A caution must 
be introduced, however, lest premature conclusions 
be drawn regarding the differential 
validity of forms administered under these 
two sets. 


Summary 


From research conducted with 23 trainee 
sections at the Naval OCS in Newport, data 
are presented relative to the reliability of 
peer-nomination scores as it is affected by 
three variables: the period of time the group 
has spent together; the nature of the set 
given, i.e., “for research purposes” or “for 
administrative purposes”; and the quality or 
characteristic to be evaluated by the nomi- 
nator in making his nominations. 

It was found that the corrected split-half 
reliability of scores from forms administered 
very early in training, after the groups had 
been together for four to five days, was a rea- 
sonable approximation of the reliability ob- 
tained with the same forms and the same 
groups at later points in training. All forms 
showed a tendency to begin with substantial 
reliability and to rise in subsequent adminis- 
trations to only a slightly higher plateau. 

The peer-nomination scores obtained at the 
end of the third week of training correlated at 
about the .90 level with those scores obtained 
on the same groups at a later time, i.e., the 
sixth week. 

There was no significant difference in the 
single-stage reliability or longitudinal 
reliability of comparable forms administered 
under the “research” as against the 
“administrative” set. 

It was concluded that a peer nomination 
administered at the third week of training— 
or even earlier—will yield substantially the 
same information as that which is now ob- 
tained at considerably later points in time. 
Furthermore, it was noted that the “adminis- 
trative” set leads to neither more nor less re- 
liable scores than those secured through the 
presumably less threatening, or more lightly 
taken, “research” set. 


Received May 28, 1956. 
Brayfield and Crockett (1) have provided 
a valuable service in their recent review of the 
literature on employee attitudes and em- 
ployee performance. Their well-substanti- 
ated conclusions as to the general absence of 
correlation between employee attitudes and 
such performance measures as productivity, 
absenteeism, accident rate, and job tenure are 
so out of keeping with commonly held as- 
sumptions that their paper will have a con- 
siderable impact upon the utilization of such 
attitude measures for both research and ad- 
ministrative purposes. In terms of the recent 
classification of validity concepts (4, 6), their 
findings point to an absence of both predic- 
tive and concurrent validity for morale sur- 
veys, for the typical purposes of their utiliza- 
tion. The present paper raises the question 
of the construct validity of morale surveys, 
and presents positive evidence of such va- 
lidity. Before giving these data, we would 
like to state briefly our understanding of con- 
struct validity, in terms somewhat different 
from Cronbach and Meehl (4), although in 
fundamental agreement with them. 

A given scientific construct has multiple po- 
tential operational specifications. If, as sam- 
pled, these operational specifications concur, 
the construct and the sampled measurement 
techniques have validity. Constructs for 
which diverse operational specifications per- 
sistently fail to agree are in the long run 
modified or abandoned. In the physical 
sciences, methodologically independent 
operational specifications of the same 
construct may agree on the order of 
.99, if expressed as a correlation coefficient 
over a population of instances. In the 
psychology of individual and group differences, 
confirmation coefficients as low as .50 may be 
comforting, even for something we measure as 
successfully and as usefully as “intelligence.” 

1 The two studies reported here were conducted in 
connection with the program of leadership studies of 
the Personnel Research Board, The Ohio State 
University, Carroll L. Shartle, Director. 

Construct validity thus becomes the corre- 
lation among two or more independent meas- 
ures as conceptually identical in their refer- 
ents as possible. But the distinction between 
reliability and validity is still a very impor- 
tant one to retain. Insofar as the measures 
share the same apparatus or the same ap- 
proach, they tend to share correlated error 
variance, or common variance due to features 
of the apparatus which are irrelevant to the 
construct in question. These problems are 
epitomized in the methodological literature 
under terms such as “apparatus factors” (7), 
“response sets” (3), and “halo effects” (5), 
all of which can result in the pseudo-confirma- 
tion of constructs. Reliability is epitomized 
by the correlation of two specifications of 
a construct through maximally similar ap- 
proaches. Construct validity is epitomized by 
the correlation between two or more specifica- 
tions of a construct maximally different in ap- 
paratus or method. The key problem is that 
of achieving an optimum difference in method, 
while at the same time retaining maximum 
identity in constructual referents. 

For the studies to be reported, the morale 
of a work-group may be taken as an average 
feeling of contentment or satisfaction about 
the major aspects of the work situation. The 
typical “morale ballot” or “employee attitude 
survey” attempts to measure this through 
anonymous voluntary self-description by the 
work-group members themselves. As a maxi- 
mally methodologically independent approach 
to the same attribute, we have taken measures 
of the morale reputation of the work-group 
with the members of other groups, in situa- 
tions where the work-groups have an oppor- 
tunity to learn about each other. 
The first study (8) was conducted in the 
home offices of the Farm Bureau Insurance 
Companies in Columbus, Ohio. The person- 
nel of each of 17 sections were given a 60- 
item job-opinion questionnaire. At the end 
of the questionnaire, they were asked “In 
which sections in your department would you 
like best to work? Like best ____ Next 
best ____ Third best ____.” They were 
tested one section at a time in the cafeteria, 
with complete anonymity and no supervisory 
personnel present. Each section was given a 
rank in terms of its over-all job satisfaction 
on the 60 items, and a rank in terms of the 
frequency with which it was mentioned by 
persons in other sections as a place where 
they would like to work. The rank-order cor- 
relation between these two measures was .65. 

In the second study ² (2) a 30-item morale 
ballot was administered off-ship and under 
conditions of anonymity to the enlisted men 
of a squadron of 10 submarines operating out 
of New London, Connecticut. At the end of 
the ballot, they were asked to name 3 ships 
in answer to the question “Which ships in this 
squadron would you most like to be on for 
peacetime duty?” The ranking of the ships 
on the morale ballot total correlated .75 with 
the ranking of the ships in terms of the num- 
ber of mentions which they received from 
other ships. 

This method of securing other work-group 
reputations along with morale ballot self-de- 
scriptions seems to offer a simple check on 
construct validity which could easily be in- 
corporated into most employee attitude stud- 
ies. In these two known instances in which 
it has been applied, it has confirmed the 
construct validity of work-unit morale with 
values of .65 and .75. Note the egalitarian 
nature of this validation procedure. Neither
self-description nor reputation has been de-
clared the criterion. Both can be recognized
as fallible and biased. Yet to the extent that
their biases can be assumed to be independ-
ent, the two have validated each other and
the construct to which they refer.

2 This study was supported by the Office of Naval
Research, under Contract N60ri-17 T. O. 111 NR 171
123 with the Research Foundation, The Ohio State
University.

These results in no wise contradict the find- 
ings of Brayfield and Crockett. Construct va- 
lidity can occur in the absence of predictive 
and concurrent validity. Indeed, in the first 
of these studies, Tyler (8) found that the job- 
satisfaction measure and the reputation meas- 
ure both showed correlations of .00 with ab- 
senteeism, in congruence with the general 
findings summarized by Brayfield and Crock-
ett. In the submarine study, neither morale
measure correlated significantly with re-enlist- 
ment rate or Naval records of ship efficiency, 
although the trend of the correlations was 
more promising, and the number of cases 
small. 


Received June 1, 1956. 

While the human engineering literature
provides an adequate account of the natu- 
ral, expected, or preferred direction of motion 
relationships between displays and controls 
mounted in the same plane, evidence about 
those moving in different planes is sparse and 
sometimes contradictory. Since it may be 
accepted that in many situations display- 
control relationships affect performance, there 
are several reasons for an attempt to supply 
the missing data. In the first place, different- 
plane relations are in fact in engineering use. 
Secondly, study of these may reveal general 
principles applicable to the whole field of dis- 
play-control relationships. Finally, display- 
control relationships have been used to vary 
task difficulty in, for instance, transfer of 
training experiments; and establishing the 
neutrality or otherwise of different-plane re- 
lationships may thus be of instrumental value 
for such experiments. 

An experiment to determine what display- 
control tendencies exist in the population at 
large should comply with the following re- 
quirements. Since the directions of motion 
used by an operator on a given occasion de- 
pend at least in part on his previous experi- 
ence or training, as wide a sample as possible 
should be tested. In addition, each S should 
be used once only to prevent transfer effects 
within the experiment. Further, it should not 
be assumed without proof that display-con- 
trol findings are “reversible”: an operator 
may turn a control knob clockwise to produce 
upward movement of a display, and the same
operator may turn anticlockwise for down on
a subsequent test, but it does not follow that
he would have turned anticlockwise for down
on initial presentation of the task.

1 The work reported here was begun at the Min-
istry of Supply’s Clothing and Equipment Physio-
logical Research Establishment, Farnborough, Eng-
land. The author is indebted for statistical assist-
ance to Mr. J. Draper of that establishment, and
later to Mr. D. J. Newell of the University of Dur-
ham. Professor R. C. Browne advised on the pres-
entation of the report.

When a display pointer moves at right 
angles to the plane of rotation of a control 
knob, there appear to be five spatial com- 
binations which a right-handed operator may 
be called upon to use with his right hand, 
and two in which the left hand would be 
necessary. These combinations are shown 
in Fig. 1. The present investigation attempts 
to determine the direction of motion tend- 
encies for these seven combinations in the 
light of the above requirements. 


Method 


Apparatus. The apparatus was built largely of 
“meccano,” faced with white cardboard. The knob 
was of 2” diameter, and the scale was 7” long. 
Rotation of the knob in either direction “wound up” 
thread on a pulley, drawing the pointer toward the 
pulley. Thus a given pointer movement could be 
produced whichever way the S turned the knob. 
The pulley was situated at opposite ends of the ap- 
paratus in Series I and Series II (see Fig. 1). To 
present the seven combinations of planes, the entire 
apparatus was moved bodily to give the required 
effect; no further adjustment was necessary but for 
a retaining wire to hold the pointer against gravity 
in the vertical display combinations. 

Subjects. A total of 718 Ss were used sequen- 
tially, 214 being required in Series I, and 504 in 
Series II. The two series were carried out sepa- 
rately, but Ss were assigned randomly to combina- 
tions within series. The composition was as fol- 
lows: Series I; soldiers of five British regiments, 172; 
research workers, 19; supporting staff and clerical 
workers, 23; Series II; soldiers, 297; medical stu- 
dents, 163; research workers, 13; supporting staff, 
31. Eleven female Ss are included in the above, and 
were used as available, there being no evidence of a 
sex difference in direction of motion tendencies. 

Procedure. The experiment was carried out in 
two phases; in Series I the seven combinations were 
tested in which the pointer started from the end of 
the scale nearest the control and moved away, while 
in Series II the display movement was toward the 
control.

Fig. 1. Display-control arrangements.

Each S was used only once, and on one combina-
tion. An S was asked his dominant hand (the writ-
ing hand being noted in case of doubt), but the
hand used was that demanded by the combination
being tested, numbers 6 and 7 being left-handed ar-
rangements. The apparatus was presented between
eye and elbow height, and the S was asked to set
the pointer to an arbitrarily selected scale marking
by means of the knob. In the few instances when
E was asked “Which way does it go?” the reply
“Twist the knob” was adequate to produce a re-
sponse. Only the direction of rotation finally chosen
was noted, any initial hesitation being discounted.
Statistical treatment. The results were analyzed 
sequentially. Responses were entered on a separate 
graph for each combination. The graph was a trans- 
formed version of the plot of sample probability 
against sample size. The number of anticlockwise 
responses was substituted on the Y axis for the 
probability of an anticlockwise response inferred 
from David’s (2) tables of 95 per cent, and 99 per 
cent, confidence bands. Each graph gave a two- 
tailed decision whether any results were significantly 
different from chance, taken as one in two trials for 
this experiment. As a precaution against errors of 
the second kind (of assuming no difference when a 
difference does exist in the general population), 
“lines of nonsignificance” were inserted. For this
purpose it was assumed that if the population were 
split more evenly than 70/30 per cent in favor of 
clockwise or anticlockwise, the difference was not of 
importance for engineering design. The number of 
Ss required made it unrewarding to test for differ- 
ences closer to chance expectancy. Testing on any 
one combination was continued until either 1 per 
cent significance was passed in one direction, or 
until the plot entered the area of 95 per cent con- 
fidence that no difference wider than 70/30 existed. 
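
The stopping rule described above can be sketched in code. This is only an approximation under stated assumptions: exact binomial tail probabilities stand in for David's charts, the "area of nonsignificance" is taken to mean that both a 30 per cent and a 70 per cent population split are rejected by one-sided tests at 2.5 per cent each, and no allowance is made for repeated looks at the data. The function names are hypothetical.

```python
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(0, k + 1))


def sequential_decision(responses, alpha=0.01, band=(0.3, 0.7), band_alpha=0.05):
    """Test Ss one at a time; True means a clockwise response.

    Stops when the two-tailed test against chance (one in two) passes
    alpha, or when the data rule out any split wider than the band.
    Returns (decision, number of Ss used).
    """
    clockwise = 0
    n = 0
    for n, response in enumerate(responses, start=1):
        clockwise += bool(response)
        # Two-tailed exact test against a 50/50 population.
        p_chance = min(1.0, 2 * min(upper_tail(clockwise, n, 0.5),
                                    lower_tail(clockwise, n, 0.5)))
        if p_chance < alpha:
            return ("clockwise" if 2 * clockwise > n else "anticlockwise"), n
        # Assumed nonsignificance rule: reject both edges of the band.
        low, high = band
        if (upper_tail(clockwise, n, low) < band_alpha / 2
                and lower_tail(clockwise, n, high) < band_alpha / 2):
            return "no important tendency", n
    return "undecided", n


# A unanimous run of clockwise responses stops as soon as 1 per cent is passed.
print(sequential_decision([True] * 20))
```

Under this simplified rule a unanimous run is decided after fewer Ss than were actually used for the unanimous combination, which is one indication that the published charts were more conservative than the sketch.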


Results 


Clockwise turning was a significant tend- 
ency for all right-hand combinations in Series 
I; however, the tendency was not reversible, 
in that significant tendencies were also clock-
wise for combinations 1a, 2a, and 4a in Se-
ries II. In fact, over both series, 63.8 per
cent of all responses were clockwise; for the
right-hand combinations (1-5), 68.8 per cent
were clockwise. A significant difference from 
chance did not appear in any of the left-hand 
combinations. Proportions of clockwise re- 
sponses (given relative to 10, not 100, for in- 
dividual combinations, as percentages would 
artificially inflate the numbers apparently 
used in many combinations), and totals, are 
shown in Table 1. Comparisons of the pro- 
portions of clockwise to anticlockwise re- 
sponses between Series I and II were made 
by tests of χ²; the resulting estimates of sig-
nificance are also included in Table 1. It
will be noted that on the whole, significantly
more anticlockwise responses were made to
Series II combinations than to those in Se-
ries I.

Left-handed Ss formed 8 per cent of the
total population. Left-handers’ responses were
extracted, and compared with right-handers’
by χ² where size of sample permitted, or
otherwise by direct calculation of relative
probabilities. Left-handed Ss produced sig-
nificantly (p < 1%) more anticlockwise re-
sponses than right-handers in the left-hand
combinations of Series II; and also signifi-
cantly (p << 1%) more anticlockwise re-
sponses while using the right hand in Series I.
Over all combinations, left-handedness was
significantly (p < 5%) associated with anti-
clockwise response.

2 With the use of more advanced sequential tech-
niques (stemming from Wald, A. Sequential Analy-
sis. New York: Wiley, 1947), the significance levels
shown in Table 1 remain the same in all combina-
tions except 2b and 4b, which would fall between
the 1% and 5% levels.


Table 1

Results for All Display-Control Combinations

              Series I                            Series II                  χ² Comparison
      Pointer Moving From Knob           Pointer Moving Toward Knob           of Series

Comb.       Proportion/10                Comb.   Proportion/10
No.         Clockwise       N     Sig.   No.     Clockwise       N     Sig.   Sig.
1a          10              18    1%     1b      7.8             59    1%     Not
2a          9.5             19    1%     2b      6.7             79    %      5%
3a          9.1             23    1%     3b      4.8             50    Not*   %
4a          9.5             20    1%     4b      6.3             152   1%     1%
5a          8.4             54    1%     5b      4.9             45    Not*   5%
Right hand
clockwise   86.6%           134          R.H.    62.6%           385          0.1%
6a          5               42    Not*   6b      4.8             52    Not*   Not
7a          5               38    Not*   7b      5.4             67    Not*   Not
Total
clockwise   72%             214          Total   59.9%           504          1%

* Prob. (0.3 < p < 0.7) > 0.95.


Discussion


Two main hypotheses appear necessary to
account for the results: a generalized clock-
wise tendency, and a helical or screw-like
tendency. For the first view, it is widely
held that display-control combinations in dif-
ferent planes are not “natural” with move-
ment in either sense; and evidence already
exists (5) that where relationships are in any
way ambiguous there is a tendency for clock-
wise response to predominate regardless of
display movement.


However, although the persistent clockwise 
stereotype throughout the five right-hand 
combinations shows no change of sense with 
change of absolute physical plane, the direc- 
tion of motion of display pointer is constant 
with regard to the control movement. Clock- 
wise rotation producing movement away from 
the point of rotation along the axis of rota- 
tion is what is normally expected of screws, 
bolts, vises, and other threaded devices. A 
helical analogy is thus as apt as a generalized 
clockwise hypothesis for Series I where both 
would predict the same outcome. Series II 
is crucial, since the helical principle would 
predict anticlockwise behavior, opposing the 
generalized clockwise principle. On the whole, 
it may be seen from Table 1 that the clock- 
wise principle dominates. However, in Se- 
ries II the clockwise proportions are reduced, 
and greater numbers of Ss are needed to ar-
rive at decisions. The χ² comparisons show
that the number of anticlockwise responses is 
highly significantly increased for the right- 
hand combinations as a group in Series II, 
and all but the first pair are individually sig- 
nificant. It must therefore be concluded that 
a helical tendency is present, although over- 
laid to some extent by the generalized clock- 
wise stereotype. 
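
The between-series comparison for the right-hand combinations as a group can be checked with a short sketch. The cell counts are not given in the report; they are reconstructed here from the percentages and Ns of Table 1 (86.6 per cent of 134 and 62.6 per cent of 385), and an uncorrected Pearson χ² is assumed, since the report does not say whether a continuity correction was applied.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts reconstructed from Table 1 percentages (an assumption, see above).
series_1 = (116, 18)    # clockwise, anticlockwise: right-hand combinations, Series I
series_2 = (241, 144)   # clockwise, anticlockwise: right-hand combinations, Series II

chi2 = chi_square_2x2(*series_1, *series_2)
CRITICAL_0_1_PER_CENT = 10.83   # chi-square, 1 degree of freedom

print(round(chi2, 1), chi2 > CRITICAL_0_1_PER_CENT)
```

On these reconstructed counts the statistic lies well beyond the 0.1 per cent point, which agrees with the entry for the right-hand group in Table 1.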

Combination 1 (see Fig. 1) only fails to 
show significant differences between series as
the numbers in Series I are very low; the 
tendency being unanimous, little testing was 
sequentially required. Warrick (5) provides 
evidence for a similar combination, using a 
rather different technique. In what corre- 
sponds to Series I, only 77 per cent of his Ss 
turned clockwise; although it should be borne 
in mind that this alone, with his sample size, 
is significantly different from chance at the 1 
per cent level. In what corresponds to Series 
II, 66 per cent of Warrick’s responses were 
clockwise. The same author found evidence 
of inconsistency for Combination 3. Salient 
differences, however, are that the light indi- 
cator moved from different starting positions, 
and Ss were apparently used more than once. 

Only Combination 3 is likely to be of prac- 
tical use. Number 3b in Series II has a high 
anticlockwise proportion and in fact extrac- 
tion of helical responses (i.e., clockwise in 
Series I added to anticlockwise in Series II)
shows 5 per cent significance against non- 
helical responses. Carter and Murray (1) 
have used the combination, but in a com- 
pound two-handed comparison with other re- 
lationships in pairs. 

The fact that the left-handed combinations, 
6 and 7, failed to show stereotypes in either 
series is probably linked to the tendency to 
symmetry in hand movements (4), and it 
must be assumed that a generalized clock- 
wise tendency is not applicable to those cases. 
It is hardly surprising that a greater anti- 
clockwise tendency should be present for left- 
handers using the left-hand combinations, but 
it is interesting to note that this is also the 
case for the right-hand combinations in Se- 
ries I, where it must be assumed that the 
dominant direction required by symmetry has 
been transferred to the nondominant member. 

The results from Series I and Series II are 
gathered from different individuals. When 
there is a significant population tendency to 
turn clockwise to produce movement of the 
pointer in both directions (as in Combination 
1), it cannot be inferred that any individual 
is likely to turn clockwise-for-toward, hav- 
ing just responded by clockwise-for-away. It 
can in fact on general grounds be assumed 
that consecutive responses will be compatible. 
What will be affected by the tendency will
be responses to initial presentations, which
might in practice occur with the pointer at 
either end, or discrete responses separated in 
time. 

It is necessary to justify inference from a 
division of responses in a population to a con- 
flict of tendencies in the individual. A divi- 
sion in the population of 6.7 clockwise to 3.3 
anticlockwise (as in Combination 2b) does 
not imply a conflict in an individual of habit 
strengths in the ratio 6.7:3.3; but it appears 
reasonable to assume that both tendencies are 
present in most individuals to some extent, 
rather than that the population consists of 
6.7 “blacks” and 3.3 “whites.” To the ex- 
tent that population tendencies are affected 
by a range of previous experiences, it does 
not seem likely that the same experiences 
which led the majority to their “decision” 
have left no residual tendency in the mi- 
nority, and vice versa. In any case there is 
direct evidence for competition of tendencies 
within the same individuals in a similar situ- 
ation, where Loveless (3) found rotary and 
linear tendencies existing together in indi- 
vidual performance on the bottom segment of 
a circular scale. It would be possible to test 
the existence of competing tendencies by 
other methods; by increase of response times 
on Series II, for example; or by the differ- 
ential resistance to interference with the 
dominant tendency of unanimous against di- 
vided “decisions”; or by differences in test- 
retest reliability, under favorable conditions. 

An incidental observation of interest in 
this connection is the number of Ss who ex- 
pressed certainty about the direction of mo- 
tion employed. Unfortunately no records 
were kept. In discussing the experiment
after taking part, many Ss showed surprise 
not only that a mechanism could work with 
opposite relations to the ones they had 
chosen, but that anyone should “naturally” 
choose an opposite relationship. “Have any 
subjects turned clockwise?” was a common 
question from anticlockwise responders; it 
was common to assume that the result was 
a foregone conclusion, in whichever direction 
the S had used. It was not noticed that 
there were any differences in certainty be- 
tween majority and minority groups. This 
phenomenon, if it is fact, does not neces-
sarily argue against the notion of conflicting
tendencies within the individual, as it is fairly 
common in psychology to find report at vari- 
ance with performance. 


Summary 


The object of the experiment was to in- 
vestigate the direction of motion relationships 
for seven combinations of display pointer 
moving at right angles to plane of rotation of 
control knob. A total of 718 Ss were tested 
by sequential methods on an apparatus pro- 
ducing a single direction of movement of a 
pointer, moving along a linear scale, for 
either clockwise or anticlockwise rotation of 
the control. 

For those combinations where the right 
hand was used, there was a significant tend- 
ency to turn the knob clockwise to produce 
movement away from the knob. This result 
is not, however, reversible, in that in most 
cases there was also a significant tendency 
for movement towards the knob to be medi- 
ated by clockwise turning. Comparison of 
the proportion of clockwise and anticlockwise 
responses between series nevertheless shows 
significantly more anticlockwise responses for 
movement towards the control; which is com- 
patible with the hypothesis of interaction of 
(a) a generalized clockwise tendency, with 
(b) a helical, or screw-like tendency. Left-
handed combinations gave rise to no signifi-
cant tendencies; but left-handed Ss gave sig-
nificantly more anticlockwise responses than
right-handers, even when the right hand was
used.

On the whole it is not advisable to employ 
any of the combinations explored in this in- 
vestigation, unless movement is to be re- 
stricted to adjustments in one direction only 
relative to the control. 


Received June 6, 1956. 

“We must learn to ascertain more precisely
the distinctive marks of promise in agricul- 
tural pursuits so that potential talent or its 
lack may be appraised with a sureness no less 
warranted than that of the judges awarding 
blue ribbons to the best colts at the state fair. 
After intensive cultivation of this area of per- 
sonnel psychology, the profession of vocational 
guidance will be able to shoulder its modest 
share of the responsibility for forestalling the 
ruin of crop-land and banishing the bugaboo 
of world-wide regression toward an age of 
starvation and savagery” (2, p. 397). 

Thus Walter V. Bingham appraised the role 
in agriculture of personnel psychology and 
vocational guidance in a posthumously pub- 
lished paper titled “Who Should Farm.” 
“There is,” he said, “a possibility that the 
human race can substantially improve the av- 
erage competence of its farmers and conserve 
the productivity of their lands, not only 
through research and education in agricul- 
tural science, but also through providing ade- 
quate occupational guidance to farmers’ sons 
as well as to others who are drawn toward 
life on a farm” (2, p. 397). 

Unfortunately, personnel research on the 
characteristics of successful farmers has been 
extremely limited. There are almost no data 
on objectively measured attributes as a com- 
prehensive survey of the literature by Byram 
and Nelson (6) reveals. Results from the 
World War I and II mental test hierarchy 
studies place farmers and farm laborers in a 
low rank although there is a wide spread of 
scores among them and there is some question 
as to the representativeness of the samples. 
When Proctor (11) followed up the careers of 
persons tested 13 years earlier while in high 
school he found that those who had become 
farm managers were classified in his Group II
which had an average IQ of 108. Data on
special aptitudes are completely lacking in
the personnel literature.

1 We are indebted to Mr. Dillard Huffaker, Mr.
Dale Thompson, and Mr. Charles Marsh for furnish-
ing subjects and making this study possible.

To date, there have been only two pub- 
lished reports of the measured interests of 
farmers. Strong (12), of course, has devel- 
oped a key based on a sample of farmers, 
most of whom were graduates of agricultural 
colleges. The Kuder Manual (16) gives me- 
dian scores on the Kuder Preference Record, 
Vocational, for an undescribed sample of 107 
farmers; their profile peaks on the Outdoor 
scale with the other scores clustering around 
the norm group medians. 

Apparently there have been no studies of 
the measured personality characteristics of 
farmers. 

Although farming is basic to the American 
economy, it is obvious that it has been a neg- 
lected area of personnel research. The in- 
vestigation reported here is a preliminary at- 
tempt to provide useful data on the personal 
characteristics of a sample of young farmers 
and to relate status on these attributes to 
job performance and job satisfaction. In ad- 
dition, the design of this study permits an as- 
sessment of the relationship between job per- 
formance and job satisfaction. 


Procedure 


Subjects. The subjects of this investigation were 
members of three Veterans On-the-Farm Training 
classes in the program sponsored by the Federal gov- 
ernment after World War II. Full cooperation of 
the three instructors was obtained in making the 
subjects available and providing time for data col- 
lection during class periods. 

Two classes located in a farming community of 
717 population in the northeastern corner of Kansas 
had 19 and 20 members, respectively. A third class 
with 22 members was located 14 miles away in a 
similar community in the same county. The area is 
one of general farming with corn being the chief 
small grain and dairy cows and hogs being the chief 
livestock concerns. This geographic distribution con- 
tributed to the homogeneity of the group with re- 
spect to the nature of the farm operation. 

Fifty farmers completed all of the materials and 
constitute the sample studied. The average member
of the group was 33 years old, married and had one
or two children (seven men were single). He was 
farm-reared, having spent an average of 25 years on 
a farm. His average length of school attendance was 
10.8 years. At the time of the study, he had been 
managing his own farm for about six and one-half 
years and was farming 228 acres which he rented. 
Seven of the men owned their farms and ten more 
owned part of their operation. The average mem- 
ber of the group had been in the training class for 
about two and one-half years. 

As supplemental background, it was found that 34 
of the 43 wives had spent 10 or more years on a 
farm. Only four of the 34 had encouraged their 
husbands to enter some other line of work accord- 
ing to the husband’s report. 

By and large, this was a stable, farm-raised group 
engaged in operating their own farms in a “good” 
general farming area in Kansas. 

Criterion measures. The Brayfield-Rothe Job Satis- 
faction Index (4) was administered to the subjects 
to obtain a measure of their attitude toward their 
job. Split-half, corrected reliabilities of .87, .78, and 
.89 previously have been reported (4, 5). For this
farmer sample, the corrected split-half reliability co- 
efficient was .60; if the three most inconsistent sub- 
jects were eliminated the corrected coefficient would 
become .77. It was found, incidentally, that the 
magnitude of the correlations reported in this study 
would not be appreciably different if they were com- 
puted, as reported, on 50 subjects or 47. Evidence 
for the validity of the Index has been reported 
earlier (4). 
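
The "corrected" split-half coefficients quoted above are presumably Spearman-Brown corrections, though the text does not name the formula; the sketch below makes that assumption. It shows the correction and its inverse, so that the half-test correlation implied by a corrected value of .60 can be recovered.

```python
def spearman_brown(half_test_r):
    """Estimate full-length reliability from the correlation between halves."""
    return 2 * half_test_r / (1 + half_test_r)


def implied_half_test_r(corrected_r):
    """Invert the correction: the half-test correlation behind a corrected value."""
    return corrected_r / (2 - corrected_r)


# A corrected coefficient of .60 implies a correlation between halves near .43.
half_r = implied_half_test_r(0.60)
print(round(half_r, 2), round(spearman_brown(half_r), 2))
```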

It is difficult to develop a criterion of success in a 
job so varied as farming. Although net income fig- 
ures were collected, they had obvious limitations. 
For example, it was not possible to correct for such 
variables as size of farm or differential living stand- 
ards. An attempt to have the subjects rate each 
other, using a nominating technique, failed; the sub- 
jects simply did not want to make such perform- 
ance judgments. 

The best criterion available appeared to be in-
structor ratings. The instructors were thoroughly
familiar with the subjects’ performance since they 
supervised them “on the job.” The three instructors 
each were asked to rank-order the members of their 
own classes at the beginning of the testing. Each 
was instructed first to select the person whom he 
considered to be the “best all-around farmer” and 
then the person whom he considered to be the 
“poorest” or “least good,” then he selected the second 
best, the second poorest, working in from the ex- 
tremes of his distribution until he had ranked each 
member of his class. 

In order to combine the ratings for three classes, 
the rank-order ratings for each class were transmuted 
into normal curve “scores” by means of Hull’s pro- 
cedure (7). Then a single distribution was made for 
the entire sample. Reratings were made from two 
to six months later. The instructors did not know 
that they would be asked to rerate. The reratings 
were treated similarly to the original ratings.
The product-moment correlation between original
and reratings gave a reliability coefficient of .86. 
Then the original and reratings were averaged to 
provide a criterion score for performance on the job. 
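
The pooling of rankings from classes of different sizes can be sketched as follows. Hull's own conversion tables are not reproduced in the report, so a standard normal deviate stands in for them here as an assumption: each rank is turned into a percent position, (rank - 0.5)/N, and then into the corresponding point on the normal curve. The member names and rankings are hypothetical.

```python
from statistics import NormalDist


def ranks_to_normal_scores(n_members):
    """Normal-curve scores for ranks 1..n, rank 1 being the best farmer."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(1 - (rank - 0.5) / n_members)
            for rank in range(1, n_members + 1)]


def scores_for_ranking(ranking):
    """ranking lists member names, best first; returns {name: score}."""
    return dict(zip(ranking, ranks_to_normal_scores(len(ranking))))


# Hypothetical class of five, ranked twice by the same instructor.
original = scores_for_ranking(["A", "B", "C", "D", "E"])
rerated = scores_for_ranking(["B", "A", "C", "D", "E"])

# Criterion score: the average of the original rating and the rerating.
criterion = {name: (original[name] + rerated[name]) / 2 for name in original}
print({name: round(score, 2) for name, score in criterion.items()})
```

Because every class is mapped onto the same scale, scores from classes of unequal size can be placed in a single distribution, which is the purpose the transmutation serves in the procedure described above.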

Measures of attributes. The Differential Aptitude 
Tests, Form A (1), provided the aptitude data. 
Only five of the eight tests in the battery were ad- 
ministered due mainly to time limitations; Spelling, 
Language Usage, and Clerical Speed and Accuracy 
were omitted. 

Vocational interests were appraised with the Kuder 
Preference Record, Form C (16). The Minnesota 
Multiphasic Personality Inventory (8) provided data 
on another set of characteristics. 

Data collection. The training classes met one eve-
ning a week. Testing was carried out over a four-
week period with approximately two hours devoted
to testing at each class meeting with a 10-minute
“break” between tests whenever possible. After the 
group testing was finished, complete data were avail- 
able for 42 persons. To bring the total to 50 sub- 
jects, eight more farmers who had missed only one 
or two tests were given them in their homes. 

The group testing conditions were relatively good 
and the conditions were uniform among the three 
classes. The majority of the subjects appeared to be 
well motivated. Only six persons appeared to be 
uncooperative and these were among the 11 elimi-
nated because of incomplete data.

All of the materials were signed by the subjects.
Thus the possibility exists that there may have been
some distortion of responses, particularly on the
personality measure and the job satisfaction blank.
The data for the K and Lie scales of the MMPI, 
reported in the next section, may help in evaluating 
this possibility. The problem of anonymity has been 
discussed elsewhere recently by Brayfield and Crock- 
ett (3) with the tentative conclusion that there is 
no predictable effect as such. In this study, rapport 
was carefully nurtured at all stages. One of the in- 
vestigators was well known and accepted in the com- 
munities where the study was made; the net effect 
of identification, however, could not be accurately 
assessed. 


Results and Discussion 

Personal attributes. The first aim of this 
study was to describe in objective terms the 
aptitude, interest, and personality character- 
istics of the farmer sample. This we did by 
comparing the mean scores of the farmers on 
various measures with the mean scores of 
relevant available norm groups. The de-
tailed results are reported in three tables.²

2 To reduce printing costs three tables have been
deposited with the American Documentation Insti-
tute. Order Document No. 5083 from ADI Auxiliary
Publications Project, Photoduplication Service, Li-
brary of Congress, Washington 25, D. C., remitting
in advance $1.25 for microfilm or $1.25 for photo-
copies. Make checks payable to Chief, Photodupli-
cation Service, Library of Congress.

On the five DAT subtests, the farmers were
compared with 2,100 twelfth grade boys se- 
lected as the most appropriate comparison 
group since the farmer sample would prob- 
ably approximate the same amount of educa- 
tion if the age differences were taken into ac- 
count. The percentile ranks of the farmers 
fall into this pattern: Verbal Reasoning, 30; 
Numerical Ability, 30; Abstract Reasoning, 
15; Space Relations, 30; Mechanical Reason- 
ing, 70. The mean differences all were sta- 
tistically significant at the 1% level except 
for Space Relations which reached the 5% 
level. 

As compared to twelfth grade students, 
then, the picture of the farm sample with re- 
spect to special aptitudes is one of mediocrity 
with but one exception. Any interpretation 
of these findings, other than that they are 
real differences, would be sheer speculation; 
“test-wiseness,” age differentials, experience, 
and the like, might be considered. For the 
vocational guidance of young persons, how- 
ever, we suggest that farmers such as are rep- 
resented by this sampling are likely to excel 
somewhat in mechanical reasoning and to be 
no more than average, at the most, in the 
other attributes studied. 

On the ten Kuder interest scales, the farm- 
ers were compared with a male sample de- 
scribed as a “group of 1000 telephone sub- 
scribers who responded to the invitation to 
fill out the Preference Record. These sub- 
scribers were in a stratified sample of 138 
cities and towns selected from the Postal 
Guide. They were chosen from all sections 
of the country” (16, p. 20). It includes only 
20 farmers and appears to be somewhat 
heavily weighted with higher level occupa- 
tions. However, the members of this norm 
group were considered to be “more similar in 
background to adults usually counseled than
a cross section of the population would be” 
(16, p. 22). 

The farm sample percentile ranks followed 
this pattern: Outdoor, 86; Mechanical, 67; 
Computational, 44; Scientific, 45; Persuasive, 
37; Artistic, 45; Literary, 20; Musical, 40; 
Social Service, 53; Clerical, 54. The mean
differences on the Outdoor, Mechanical, Per-
suasive, and Literary scales are significant at
the 1% level; the Artistic and Musical mean
differences are significant at the 5% level.³

The same profile pattern is found when 
Kuder’s second set of male norms is used al- 
though the Outdoor and Mechanical ranks 
drop slightly (16). 

For vocational guidance purposes, we would 
stress Outdoor interests as typical of farmers 
with some secondary indication of Mechani- 
cal interests. 

For the MMPI, the comparison group is 
composed of adult male visitors to the Uni- 
versity of Minnesota hospitals who were 
chosen to represent a cross section of the 
Minnesota population (8). It probably has 
a more “rural” loading than either of the 
other two comparison groups. The T-score 
pattern for the farmers on the MMPI was: 
L, 51; F, 51; K, 56; Hs, 53; D, 51; Hy, 55; 
Pd, 53; Mf, 54; Pa, 50; Pt, 52; Sc, 50; Ma, 
50; Si, 52. The mean differences for K, Hy, 
and Mf are significant at the 1% level; Pd is 
significant at the 5% level.⁴ However, all of 
the scores are within the normal range as in- 
terpreted in the Inventory Manual. 

Job performance. A second objective of 
this study was to determine the relationships, 
if any, between measured personal attributes 
and performance in farming as indicated by 
instructors’ ratings. The findings are reported 
in Tables 1, 2, and 3. 

Although all of the DAT measures were 
positively related to performance ratings, only 


Table 1 

Correlations Between Differential Aptitude Test Scores 
and Job Performance and Job Satisfaction 
for 50 Farmers 

Subtest                  Performance Rating    Job Satisfaction 
Verbal Reasoning               .19                  .00 
Numerical Ability              .36**                .05 
Abstract Reasoning             .23                 −.06 
Space Relations                .21                 −.09 
Mechanical Reasoning           .22                  .16 

** Significant at the 1% level. 

³ Dr. G. Frederic Kuder kindly furnished the means 
and standard deviations for the comparison group. 
⁴ Dr. Starke R. Hathaway kindly furnished the 
means and standard deviations for the comparison 
group. 
Table 2 

Correlations Between Kuder Preference Record Scores 
and Job Performance and Job Satisfaction 
for 50 Farmers 

Subtest          Performance Rating    Job Satisfaction 

Outdoor .05 .13 

Mechanical .22 .02 

Computational 10 

Scientific .22 

Persuasive .22 

Artistic 

Literary .28* 
Musical . 11 

Social Service 

Clerical 16 


* Significant at the 5% level. 
** Significant at the 1% level. 


the correlation between scores on the Nu- 
merical Ability subtest and ratings as shown 
in Table 1 is statistically significant. 

Among the interest measures, as reported 
in Table 2, only the Scientific scale is sig- 
nificantly correlated with performance rat- 
ings. Since positive relationships between in- 
terest and performance are only infrequently 
found (12, 13, 16), this result is of particular 
interest. It appears to lend weight to the 
assertion that modern farming is a scientific 
enterprise. Since the Numerical Ability test 
may be considered to be a test of quantita- 
tive ability, the picture of a scientific farmer 
gains credence. Since the instructors were 
technically trained in agriculture, it is pos- 
sible that their performance evaluations re- 
flected this emphasis. 

There were no significant obtained rela- 
tionships between performance and the per- 
sonality variables measured by the MMPI. 

Job satisfaction. The relationships, if any, 
between job satisfaction and measured per- 
sonal attributes were investigated with the 
Brayfield-Rothe Job Satisfaction Index serv- 
ing as the criterion. 

As indicated in Table 1, there were no sig- 
nificant relationships between aptitude and 
job satisfaction. This is one of the signifi- 
cant findings of this study since there is so 
little evidence bearing upon this question (13). 

Literary interests were found to be significantly, 
although inversely, related to job 
satisfaction. As can be seen in Table 2, the 
magnitude of the relationship is low. It is of 
some interest that the Scientific scale corre- 
lated .22 with satisfaction but the relation- 
ship is not statistically significant. 

Perhaps the most interesting results of 
the entire investigation are to be found in 
Table 3 in the correlations reported between 
measured personality characteristics and job 
satisfaction. Four MMPI scales are signifi- 
cantly related to job satisfaction. Both De- 
pression and Social Introversion-Extroversion 
are negatively related to satisfaction. That 
is, there is some tendency for farmers whose 
outlook on life is dark and gloomy and who 
show signs of social withdrawal to be dissatis- 
fied with their jobs. These results are per- 
haps relevant to the findings of Wesley (15), 
Weitz (14), and Brayfield and Wells (5) 
that job satisfaction is related to general 
satisfaction among males. 

Investigators of employee attitudes have 
always been plagued by the indeterminacy 
of their results due to the possibility of 
“faking,” “fudging,” or some other variety of 
distortion of responses. The question of the 
effect of anonymity, as mentioned earlier, is 


Table 3 

Correlations Between Minnesota Multiphasic Personality 
Inventory Scores and Job Performance 
and Job Satisfaction for 50 Farmers 

Subtest                    Performance Rating    Job Satisfaction 
L                                .07                 .37** 
F                                .17                 .13 
K                                .02                 .35** 
Hypochondriasis                 −.03                 0A 
Depression                      −.20                −.33* 
Hysteria                         .07                 .00 
Psychopathic Deviate            −.19                 .05 
Masculinity-Femininity          −.18                −.18 
Paranoia                         .14                 .03 
Psychasthenia                    .12                −.22 
Schizophrenia                    .10 
Hypomania                        .02                 .04 
Social I. E.                     .20                −.28* 

* Significant at the 5% level. 
** Significant at the 1% level. 
a case in point. In the circumstances of the 
present study, we did find that test-taking 
attitudes bore a low, positive relationship to 
scores on the job satisfaction blank. Specifically, 
scores on the Lie scale of the MMPI 
correlated .37 with scores on the job satisfaction 
blank and scores on the K scale correlated 
.35 with job satisfaction scores. Both 
correlations are significant at the 1% level. 
To the extent that the L and K scales on the 
MMPI are valid measures of tendencies to 
distort responses we can assume that our job 
satisfaction blank results were slightly con- 
taminated with a “best foot forward” or de- 
fensive attitude (10). Whether or not this is 
a function of the requirement that respond- 
ents identify their blanks we cannot say. 
Hyman (9) has suggested that it is neces- 
sary to differentiate psychological from lit- 
eral anonymity; possibly the situation itself 
evokes a response set independent of the ef- 
fects of identification. Our data simply indi- 
cate the possible influence of such a set. 
Incidentally, if the obtained correlations were 
corrected for attenuation the effect would ap- 
pear to be somewhat more important. 
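
The remark about attenuation can be illustrated with the classical correction formula, in which the observed correlation is divided by the square root of the product of the two reliabilities. The sketch below is illustrative only: the reliability values are hypothetical, since the reliabilities of the L scale and of the job satisfaction blank are not reported in this passage.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Classical correction: observed r divided by sqrt of the two reliabilities."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Observed L-scale correlation from the text; both reliabilities are HYPOTHETICAL.
observed_r = 0.37
assumed_reliability_l_scale = 0.80       # assumption, not from the study
assumed_reliability_satisfaction = 0.85  # assumption, not from the study

corrected = correct_for_attenuation(
    observed_r, assumed_reliability_l_scale, assumed_reliability_satisfaction
)
print(f"corrected r = {corrected:.3f}")
```

Because both reliabilities are below 1.00, the corrected value is necessarily larger than the observed one, which is the sense in which the effect "would appear to be somewhat more important."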

It was of some interest in this study to investigate 
the relationship between performance and job 
satisfaction, and a preliminary report of the finding 
has been made in a comprehensive survey of the 
question (3). Among the farmer sample the two 
variables were correlated .115, which is not 
statistically significant. That is, those farmers ranked 
as the most proficient were not necessarily the 
most satisfied. The converse also holds; the 
most satisfied were not necessarily the most 
proficient. 
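
As a rough check on the statement that a correlation of .115 among 50 farmers is not significant, the coefficient can be converted to a t ratio with N − 2 degrees of freedom. The sketch below assumes a two-tailed test, which the text does not specify; the critical value is taken from a standard t table for 48 degrees of freedom.

```python
import math


def t_for_r(r: float, n: int) -> float:
    """t ratio for testing a Pearson r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def critical_r(t_crit: float, n: int) -> float:
    """Smallest |r| that reaches the given critical t for a sample of size n."""
    return t_crit / math.sqrt(t_crit ** 2 + (n - 2))


N_FARMERS = 50
T_CRIT_05 = 2.0106  # two-tailed .05 critical t for df = 48 (standard table value)

t_value = t_for_r(0.115, N_FARMERS)
r_needed = critical_r(T_CRIT_05, N_FARMERS)
print(f"t = {t_value:.2f}; |r| needed at the 5% level = {r_needed:.3f}")
```

Under these assumptions an r of about .28 is needed for significance at the 5% level with 50 cases, so .115 falls well short.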

For vocational guidance purposes the fol- 
lowing picture emerges from our findings. 
Young but experienced farm operators may 
be characterized as being somewhat superior 
in mechanical reasoning involving practical 
objects and relationships. The better farmers 
among them are somewhat more likely to 
excel in the ability to handle quantitative 
materials. Typically, they prefer outdoor 
and, to a lesser extent, mechanical activities; 
they are somewhat more likely to be rated 
successful if they have scientific interests. 
The more satisfied farmers are likely to be 
those who have an optimistic outlook on life, 
are socially outgoing, lack literary interests, 
and perhaps generally tend to “put their best 
foot forward” or be defensive in their self- 
perceptions. 


Summary 


A group of fifty young farmers had distinc- 
tive aptitude and interest test profiles. Their 
personality test pattern was within the nor- 
mal range. 

Numerical ability and scientific interest 
were found to be positively and significantly 
related to performance on the job. Literary 
interest was negatively but significantly re- 
lated to job satisfaction. The Depression and 
Social Introversion-Extroversion scales on the 
MMPI were negatively but significantly cor- 
related with job satisfaction. 

Test-taking attitudes had a positive and 
significant relationship to scores on the job 
satisfaction measure. 

Job satisfaction and job performance were 
uncorrelated. 

In the absence of more extensive data, 
these findings provide some guideposts for the 
vocational counseling of potential farmers. 


Received June 14, 1956. 
An Examination of Visual Acuity and Depth Perception as 
a Function of Magnification * 


Edward C. Weiss 


U.S. Army Ordnance Corps Human Engineering Laboratory 
Aberdeen Proving Ground, Md. 


Despite the rapid advances in the develop- 
ment of radar and other electronic equipment, 
optical instruments retain a vital role as fire 
control devices. Magnification occupies a cen- 
tral role in ordnance optical equipment and, 
therefore, it is still appropriate to investigate 
its effectiveness as a visual aid to army gun- 
nery. 

Magnification is supposed to increase the 
distance over which a target can be seen, and 
increase the accuracy in sensing of rounds 
and differentiating targets in distance. In 
more general terminology it is supposed to 
improve visual acuity and depth perception. 
A War Department technical manual on 
stereoscopic range and height finding (8) 
concludes from the mathematical principles 
involved that magnification improves the pre- 
cision of stereoscopic range finding. In con- 
currence, an NDRC (7) report suggests that 
magnification be increased to the maximum 
permitted by practical considerations of equip- 
ment design in order to increase accuracy of 
tracking. However, Bartley (1, 2) has found 
a flattening effect produced by optical mag- 
nification or a reduction in three-dimensional 
qualities of an object as viewed by an ob- 
server. This finding would seem to be at 
cross purposes with a device such as a range 
finder whose existence is justified by its abil- 
ity to lend depth to the situation. 

A NAVORD (3) study found that range 
of vision, as implemented by hand-held in- 
struments, was not effectively increased with 
additional magnification beyond 6 power. 
The effectiveness of mounted instruments, 
however, was found to increase up to at 
least 20 power which was the highest power 
available. In this instance, however, the criterion 
variables employed were not concerned 
with visual acuity or depth perception, per se. 

Holway et al. (5) found that stereo acuity, 
when expressed in angular units, became less 
effective as magnification was increased at 
any given range, but improved as the dis- 
tance was increased at any particular mag- 
nification. This, of course, implies that depth 
perception becomes poorer with increases of 
magnification but improves with increased 
distance. However, the same results ex- 
pressed in per-cent units indicate that depth 
perception is independent of magnifying 
power and nearly, though not exactly, inde- 
pendent of range. 

The purpose of the present study was to 
determine whether or not magnification should 
be increased to the maximum possible for 
hand-held instruments, or whether an opti- 
mum existed beyond which the law of dimin- 
ishing returns was operative. 

The study was composed of two separate 
but related experiments. The first was con- 
cerned with effects of magnification on visual 
acuity and the second with the relationship of 
magnification to depth perception. 


Methods 
Subjects 


Twenty enlisted men from the 723rd Armored Bat- 
talion, 71st Division, located at Camp Irwin, Cali- 
fornia, served as subjects (Ss) for both phases of 
the study. All the men exhibited 20/20 vision un- 
corrected and they were all qualified operators of 
the various optical equipment associated with tanks. 


Apparatus 


Visual acuity. The apparatus employed in the 
visual acuity phase of the study was a modified 
Landolt Ring display. Ten black broken rings re- 
sembling the letter “C” in form were attached to 
a large white board of homogeneous surface. The 
rings were fastened independently to the board in a 
fashion that permitted their rotation through 360°, 
while tightening a winged nut enabled them to be 
fixed so that the breach appeared at 0°, 90°, 180°, 
or 270°. 

The stroke and break were equal for any given 
ring, and these were calibrated so that visual angles 
from 5.0 min. to 0.1 min. were subtended at the eye 
of an observer 100 yd. from the target presentation. 
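
The calibration described above follows from simple geometry: the physical width of a stroke or break that subtends a given visual angle grows in proportion to viewing distance. A minimal sketch of that conversion is given here; the function name is illustrative and the individual ring sizes used in the study are not listed in the text, so only the two stated extremes and the 1-minute case are evaluated.

```python
import math

INCHES_PER_YARD = 36.0


def gap_width_inches(visual_angle_min: float, distance_yd: float = 100.0) -> float:
    """Width in inches that subtends the given visual angle (minutes of arc)."""
    angle_rad = math.radians(visual_angle_min / 60.0)
    # Object centered on the line of sight: width = 2 * d * tan(angle / 2).
    return 2.0 * distance_yd * INCHES_PER_YARD * math.tan(angle_rad / 2.0)


for angle in (5.0, 1.0, 0.1):
    print(f"{angle:4.1f} min of arc at 100 yd -> {gap_width_inches(angle):.3f} in.")
```

At 100 yd. a break of roughly one inch therefore corresponds to one minute of arc, and the smallest ring in the display would need a break of about a tenth of an inch.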

Depth perception. The depth perception phase of 
the study involved two white tombstone-shaped tar- 
gets 64 in. wide and 90 in. high. One of the targets 
was stationary and the other was mounted on the 
front of a jeep in such a fashion as to completely 
conceal the vehicle from observers at the viewing 
line. 

The tombstone shape was selected in order to 
minimize the cue of vernier acuity and it thus served 
as a partial control in isolating the dependent vari- 
able to judgments based on binocular disparity. 

The independent variable of magnification was ex- 
pressed by means of three pairs of binoculars and 
the naked eye for both phases of the study. The 
binoculars consisted of 6 × 30 and 7 × 50 American 
instruments and a 10 × 50 Japanese binocular. The 
naked eye or one-power condition was implemented 
by an eyeshield to restrict the field of view to an 
area approximating that of the binoculars. A further 
control for this treatment was a pair of medium 
density Calobar sunglasses, which was an attempt 
to equate for the coated lenses of the binoculars. 

In order to obtain the long flat range desired, both 
phases of the study were conducted at the Yuma 
test station, Yuma, Arizona. Therefore all the ex- 
perimental treatments were examined under desert 
conditions. The down-range direction was slightly 
east of north so that the shadow effect was mini- 
mized. The surface of the ground was typical of the 
locale and homogeneous from the line of sight for a 
distance of 2,000 yd. The data were gathered during 
the period from 0800 to 1630 daily. 


Procedure 


Visual acuity. The Landolt Ring display was 
placed at a range of 100 yd. from the observation 
line and in a position normal to the observation line. 
Four Ss were run simultaneously, each man using a 
different power of magnification. The three bin- 
oculars and the eyepiece associated with the naked- 
eye (one-power) condition were rotated among the 
four Ss according to a previously determined ran- 
domized plan until all of the 20 Ss had viewed the 
target with each level of magnification. 

The Ss were posted at the observation line with a 
separation of two yd. between men. This arrange- 
ment was symmetrical to a center line which bisected 
the Landolt Ring target. 

The men were instructed in the use of binoculars 
according to the Navy Lookout Manual (4) and 
given the opportunity to focus and adjust the in- 
struments. They were then told to stand at the 
viewing line with their backs to the target. At the 
command “Turn” they were instructed to face the 
target and, starting with the large ring in the upper 
left-hand corner, to report the position of the gaps 
in the Landolt Rings in terms of up, down, left, or 
right. The visual acuity data were then obtained 
on the Landolt Ring target for each subject under 
all four powers of magnification. The order of usage 
of binoculars as well as the order of target presenta- 
tion was completely randomized. 

Depth perception. The independent variables in 
this phase of the study were magnification and 
range. The same three binoculars and eyepiece in- 
strumented the conditions of magnification as in the 
visual acuity phase of the study. The ranges in- 
volved were 200, 400, 800, and 1,600 yd. 

The present investigation of depth perception em- 
ployed the Method of Constant Stimuli instead of 
the more usual Method of Limits in order to avoid 
the cue of movement and to allow four Ss to be 
tested simultaneously. This involved placing the sta- 
tionary target at either 200, 400, 800, or 1,600 yd. 
and having the variable, or jeep-mounted target, 
move to one of eight positions—four in front and 
four in back of the stationary target—until it had 
been located at each of these positions ten times for 
a total of eighty observations. The angular separa- 
tion between the targets was maintained at a con- 
stant of 5 min. of visual arc. 

It was necessary to determine the variable stimu- 
lus positions on some a priori basis, rather than on 
an empirical basis, and since ten per cent of each 
range involved appeared to be compatible with the 
accuracy standards of tank and field artillery gun- 
nery, it was selected as the maximum limit on each 
side of the stationary target. Thus, at 200 yd. this 
allowed for 5 yd. between the variable stimulus po- 
sitions. 
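
The spacing rule just described can be sketched as follows, assuming (as the 200-yd. example implies) that the four positions on each side are equally spaced out to ten per cent of the range. The sign convention, with negative offsets nearer the observer, is an assumption made for illustration.

```python
def variable_positions(range_yd: float, max_fraction: float = 0.10,
                       steps_per_side: int = 4):
    """Return (step size, list of variable-target distances) for one range."""
    step = range_yd * max_fraction / steps_per_side
    # Offsets -4..-1 (in front of the stationary target) and +1..+4 (behind it).
    offsets = [k * step for k in range(-steps_per_side, steps_per_side + 1) if k != 0]
    return step, [range_yd + offset for offset in offsets]


for stationary in (200, 400, 800, 1600):
    step, positions = variable_positions(stationary)
    print(f"{stationary:5d} yd: step {step:5.1f} yd, positions {positions}")
```

Under this rule the step between adjacent positions doubles with each doubling of range, from 5 yd. at 200 yd. to 40 yd. at 1,600 yd.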

Due to the parallax involved, groups of four Ss 
were aligned with regard to the line which bisected 
the angle of the standard and variable targets, rather 
than along the viewing line normal to the targets. 
It was assumed that this longitudinal separation 
would have no effect at the distances involved. 

The Ss were instructed to face away from the 
down-range position and at the command “Turn” to 
face the targets and judge whether the variable tar- 
get was nearer to them or farther away than the 
stationary target. When all four Ss had completed 
their judgments, the jeep was contacted by radio 
and told to proceed to the next position of the ran- 
domized plan. The various powers of magnification 
were randomized by Ss at each range. Four ran- 
domized plans were also provided for the movement 
of the variable target. Thus each S in any group of 
four viewed the targets according to a different plan 
of presentation with each of the four magnifications. 
Eighty observations, by each of twenty Ss, at each 
of four powers of magnification, at each of four 
ranges, yielded a total of 320 measures. 


Results 
Visual Acuity 


The raw data were recorded in terms of up, 
down, left, or right with respect to the position 
of the gap in the various Landolt Rings. 
These measures were scored against the random 
plan which governed the particular presentation 
in the following manner. The S was 
given credit for all correct responses until two 
successive errors occurred, in which instance 
the visual angle, in minutes of arc of the last 
ring correctly observed, yielded his acuity 
score. In the Landolt Ring procedure a sub- 
tended arc of one minute of visual angle is 
equivalent to the more familiar Snellen rating 
of 20/20 vision. 
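
The scoring rule above can be expressed as a short routine. This is a sketch only: the ten ring sizes and the response key shown are hypothetical (the text gives only the extremes of 5.0 and 0.1 min.), and the handling of a record with no two successive errors is an assumption.

```python
def acuity_score(ring_angles_min, responses, key):
    """Visual angle of the last ring read correctly before two successive errors."""
    last_correct = None
    successive_errors = 0
    for angle, reported, actual in zip(ring_angles_min, responses, key):
        if reported == actual:
            last_correct = angle
            successive_errors = 0
        else:
            successive_errors += 1
            if successive_errors == 2:
                break  # scoring stops at the second consecutive miss
    return last_correct


# HYPOTHETICAL ring sizes (min. of arc), largest first, and gap positions.
rings = [5.0, 4.0, 3.0, 2.0, 1.5, 1.0, 0.7, 0.5, 0.3, 0.1]
key = ["up", "left", "down", "right", "up", "down", "left", "right", "up", "left"]
said = ["up", "left", "down", "right", "down", "down", "left", "up", "down", "left"]

print(acuity_score(rings, said, key))
```

In this made-up record the single miss on the fifth ring is forgiven, and scoring stops at the eighth and ninth rings, so the score is the 0.7-min. ring.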

The data were examined according to a re- 
peated-measurements design as described by 
Lindquist (6). A Bartlett’s test for homo- 
geneity of variance was computed for the 
magnification by subject interaction and re- 
sulted in a corrected chi square of 85.258. 
Therefore, the assumption of homogeneity of 
variance was rejected. Lindquist (6, p. 86) 
suggests that an analysis of variance can be 
computed in the event of heterogeneous vari- 
ances by raising the “apparent” level of sig- 
nificance. Table 1 presents a summary of 
the visual acuity data and Table 2 the analysis of 
variance. 

These results indicate a statistically signifi- 
cant difference between magnifications which 
the reader may be reluctant to accept as a 


Table 1 
Summary of Visual Acuity Data in Minutes of 
Arc for 20 Subjects 


1x Ox 


69 
345 


7x 10 


6.2 
310 


7.5 
375 


15,2 
700 





Table 2 

Analysis of Variance of Visual Acuity 
Data Including 1× 

Source of Variation     Sum of Squares    df    Mean Square     F 
Magnifications (M)           2.65           3       .88        5.5** 
Subjects (S)                 3.94          19       .21        1.3 
M × S                        9.05          57       .16 
Total                       15.64          79 

** Probability less than .01. 
Table 3 

Analysis of Variance of Visual Acuity 
Data Without 1× 

Source of Variation     Sum of Squares    df    Mean Square     F 
Magnifications (M)           0.04           2       0.02       1.00 
Subjects (S)                 0.61          19       0.03       1.50 
M × S                        0.90          38       0.02 
Total                        1.55          59 





significant mean difference because of the 
heterogeneity which was found to obtain from 
the examination of the data by the Bartlett’s 
test. 

Since it was obvious that the heterogeneity 
of variances was associated with the 1 x or 
naked-eye condition, it was decided to ana- 
lyze the three powers of magnification with 
the naked-eye condition excluded. A Bart- 
lett’s test was computed under these condi- 
tions for the magnification by subject inter- 
action. The assumption of homogeneity of 
variance was accepted and the analysis is pre- 
sented in Table 3. In this instance there was 
no significant difference between magnifica- 
tions. 


Depth Perception 


The raw data consisted of nearer and far- 
ther responses in regard to the relationship of 
the variable target to the stationary target. 
In reference to the eight variable target po- 
sitions previously described, a threshold value 
of 5.00 represents perfect depth discrimina- 
tion at any of the four ranges. Thus, the 
thresholds presented represent the minimal 
depth which is perceptible 50% of the time, 
or the mean stimulus threshold (RL) (9). 
The thresholds were derived arithmetically, 
rather than graphically, by the method of av- 
erage Z scores as described by Woodworth 
and Schlosberg (9, p. 205). 
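
To illustrate how a 50% point can be derived arithmetically from constant-stimuli data, the sketch below converts the proportion of "farther" judgments at each position to a z score and finds where a fitted line crosses z = 0. It substitutes an ordinary least-squares fit for the textbook's average-z arithmetic, so it is a related technique and not necessarily the computation used in the study. The position coding (1 to 4 in front, 6 to 9 behind, with 5 as the stationary target), the sample proportions, and the clipping of proportions of 0 and 1 are all assumptions.

```python
from statistics import NormalDist, mean


def pse_from_proportions(positions, p_farther, clip=0.05):
    """Position at which 'farther' would be reported 50% of the time."""
    normal = NormalDist()
    # Clip extreme proportions so the inverse normal stays finite (assumption).
    z = [normal.inv_cdf(min(max(p, clip), 1.0 - clip)) for p in p_farther]
    x_bar, z_bar = mean(positions), mean(z)
    slope = (sum((x - x_bar) * (zi - z_bar) for x, zi in zip(positions, z))
             / sum((x - x_bar) ** 2 for x in positions))
    intercept = z_bar - slope * x_bar
    return -intercept / slope  # where the fitted line crosses z = 0


# HYPOTHETICAL data: ten judgments at each of the eight variable positions.
coded_positions = [1, 2, 3, 4, 6, 7, 8, 9]
proportion_farther = [0.0, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 1.0]

print(round(pse_from_proportions(coded_positions, proportion_farther), 2))
```

With responses that are symmetric about the stationary target, the estimate falls at position 5, which corresponds to the value of 5.00 described above as perfect discrimination.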

These data were also examined according 
to a repeated-measurements design. The 
absolute thresholds, or points of subjective 
equality, were computed in terms of the vari- 
able target positions rather than in yards. 

Table 4 presents a summary of the depth 
perception data, and the analysis of variance 
Table 4 


Summary of Depth Perception Data Expressed as 


Absolute Thresholds in Terms of Variable Target 


Positions for 20 Subjects at 4 Ranges with 4 Levels of Magnification 


Range 200 Yards 400 Yards 
1x 


5.47 


Ox 
5.47 


Magnification 1X 6X 7X 10X 
‘ 4.49 4.38 4.42 4.28 


7X 


Xx 4.82 


is presented in Table 5. The assumption of 
homogeneity of variance was tenable for the 
range by subject and magnification by sub- 
ject interaction. The range by magnification 
interaction was not significant. There was 
also no significant difference between magnifi- 
cations. However, there was a significant dif- 
ference between ranges and between subjects. 
Upon further examination of the data, the 
difference between ranges appeared to be due 
to a sign change accompanying the 1× and 
6× magnification at the 200-yard range 
rather than because of the magnitude of the 
difference. Therefore, the findings appear 
compatible with those of Holway (5) who 
found that the error expressed in per-cent 
units was independent of the magnifying 
power and nearly independent of range. 
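
The entries of Table 5 can be checked arithmetically, since each mean square is a sum of squares divided by its degrees of freedom. In the sketch below the sums of squares are those printed in Table 5; the interaction degrees of freedom are derived from the design (4 magnifications, 4 ranges, 20 subjects), and the choice of error term for each F ratio is an assumption inferred from which ratios come close to the printed values.

```python
sums_of_squares = {
    "M": 3.14, "R": 40.99, "S": 43.58,
    "MxR": 8.90, "MxS": 61.90, "RxS": 62.29, "MxRxS": 104.58,
}
# Degrees of freedom derived from 4 magnifications x 4 ranges x 20 subjects.
degrees_of_freedom = {
    "M": 3, "R": 3, "S": 19,
    "MxR": 9, "MxS": 57, "RxS": 57, "MxRxS": 171,
}
mean_squares = {k: sums_of_squares[k] / degrees_of_freedom[k] for k in sums_of_squares}

# ASSUMED error terms: each main treatment effect against its interaction with subjects.
error_term = {"M": "MxS", "R": "RxS", "S": "MxRxS",
              "MxR": "MxRxS", "MxS": "MxRxS", "RxS": "MxRxS"}
f_ratios = {k: mean_squares[k] / mean_squares[e] for k, e in error_term.items()}

for source in sums_of_squares:
    f_text = f"{f_ratios[source]:.2f}" if source in f_ratios else ""
    print(f"{source:6s} MS = {mean_squares[source]:6.2f}  F = {f_text}")
print("total SS =", round(sum(sums_of_squares.values()), 2))
```

Under these assumptions the ratios agree with the printed F values to within rounding, the magnification effect yields an F below 1.00, and the component sums of squares add to the printed total of 325.38.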

It was deemed advisable to utilize the vari- 
ance as a dependent variable and thus ex- 
amine the sensitivity of the thresholds. The 
Bartlett’s test performed on the range by sub- 
ject interaction yielded a corrected chi square 
of 173.178, while the corrected chi square ob- 


Table 5 

Analysis of Variance of Depth Perception Data 
Expressed as Absolute Limens in Terms 
of Variable Target Positions 

Source of Variation     Sum of Squares    df    Mean Square      F 
Magnifications (M)            3.14          3       1.05         † 
Ranges (R)                   40.99          3      13.66       12.55** 
Subjects (S)                 43.58         19       2.29        3.75** 
M × R                         8.90          9       0.99        1.62 
M × S                        61.90         57       1.09        1.79** 
R × S                        62.29         57       1.09        1.79** 
M × R × S                   104.58        171       0.61 
Total                       325.38        319 

† F less than 1.00. 
** p < .01. 


800 Yards 1600 Yards 


10X 
4.51 


1x 6X 7x 10 
4.91 4.74 4.52 4.59 


1x 6X 7K 10K 
4.08 4.06 4.33 4.30 


tained for the magnification by subject inter- 
action was 112.174. Since these chi squares 
were associated with but three degrees of 
freedom, they were significant beyond the .01 
level. Since the means and variances were 
not proportional it was decided to examine 
the data graphically rather than perform a 
normalizing transformation. 

Figure 1 presents the mean thresholds and 
variances for the twenty Ss plotted as a function 
of magnification. Figure 2 presents these 
same variables as a function of range, while 
Figure 3 presents the information as a function 
of magnification and range. Figure 3 
clearly shows the interaction between range 
and magnification. Thus, with increases in 
range the variance increases, i.e., the sensitivity 
decreases, with the naked eye and 6-power 
binoculars to a much greater extent 
than occurs with the 7- and 10-power binoculars. 

Fig. 1. Absolute thresholds (RL) and variances (s²) expressed 
in terms of variable target positions as a 
function of magnification. 

Fig. 2. Absolute thresholds and variances expressed 
in terms of variable target positions as a 
function of range. 


Discussion 
Visual Acuity 


Visual acuity at a range of 100 yd., examined 
under conditions of high ambient temperatures 
and desert terrain with a Landolt Ring 
presentation, was significantly better with 
binoculars than with the naked eye. 
However, within the aforementioned conditions 
the 6 × 30, 7 × 50, and 10 × 50 binoculars 
did not demonstrate any significant 
difference. The fact that there was no significant 
difference between Ss lends some 
rather indirect support to the efficacy of employing 
the Bausch and Lomb Orthorater,² 
on which the Ss were screened, as a selection 
device for field studies of visual acuity. 


Depth Perception 


Performance in regard to judgments of 
depth, when made under desert conditions 
and expressed in terms of a percentage of the 
distance involved, is independent of magnification 
and nearly, although not exactly, independent 
of range. The sensitivity of such 
judgments, however, decreases sharply at extreme 
distances and this decrease is more severe 
for magnification of 1× and 6× than 
for 7× and 10×. 

From the standpoint of methodology, the 
efficiency of employing the Method of Constant 
Stimuli in a field study of depth perception 
was fully justified. The results approximate 
those obtained by other methods 
in terms of accuracy, and in addition, the method 
affords the advantage of recording more than 
one S at a time. 

Fig. 3. Absolute thresholds (RL) and variances (s²) expressed 
in terms of variable target positions as a 
function of magnification and range. 

² Bausch and Lomb Optical Company, Rochester, 
New York. 

The findings do not substantiate the predic- 
tions of geometric optics in that performance 
and magnification do not exhibit a linear re- 
lationship. The performance under all con- 
ditions of magnification and at all ranges is 
consistent with previous empirical findings in 
that it was surprisingly better than expected. 


Summary 


The purpose of the present investigation 
was to examine the effectiveness of magnifi- 
cation as a visual aid to ordnance optics. It 
was conducted under conditions of desert ter- 
rain and high ambient temperatures. The na- 
ture of the dependent variables involved ef- 
fectively dichotomized the study, one phase 
dealing with visual acuity and the other with 
measures of depth perception. 

While extrapolations and generalizations beyond 
the boundaries imposed by the study 
must await future research, it may be tentatively 
recommended that increases in magnification 
beyond 7× should not be considered 
for incorporation in hand-held instruments. 


Received July 12, 1956. 
Human Relations Behavior On The Job: 
The Officer Behavior Description 


Aaron J. Spector ¹ 


Officer Education Research Laboratory, Maxwell Air Force Base, Alabama 


The major objective of the human relations 
training program at Air University is modifi- 
cation of attitudes of officers. Accordingly, 
evaluation of the effectiveness of this part 
of the curriculum required psychological in- 
struments capable of reliably measuring atti- 
tudes. The Attitudes Test in Human Rela- 
tions (ATHURE), a forced-choice test, was 
developed for this purpose (8). Since a 
criterion instrument for the ATHURE was 
needed, the Officer Behavior Description 
(OBD), an objective description of human 
relations behavior “on the job,” was devel- 
oped. This step was undertaken because ex- 
isting criterion instruments were not consid- 
ered satisfactory. Such tests as are currently 
used were either standardized on the wrong 
population (2) for our needs, or else they 
measured the respondent’s approval of given 
behaviors (1) rather than his performance of 
them. Inasmuch as attitudes were to be in- 
ferred from the behaviors, it was imperative 
that the behaviors be measured, or witnessed, 
“on the job.” 


Content of the Test 


Selection of Behaviors for Item Construction 


Fortunately for this undertaking, the preliminary 
work was accomplished in a critical-incident study 
conducted by the American Institute for Research 
(AIR), under contract with the Human Resources 
Research Institute (4). 

AIR interviewed 640 officers, 90% of whom were 
majors or higher. A total of 3,029 critical incidents 
of effective and ineffective behaviors was obtained 
from these interviews and classified within 58 gen- 
eral categories of behavior. Eighteen of these cate- 
gories were judged by the present author to be re- 
lated to human relations rather than to technical 
ability, protocol, and management. The categories 
borrowed from AIR’s study provided the frame- 
work for establishing general areas, each of which 
would contain several items. Illustrative of the areas 
are the following: methods he used to improve motivation 
of subordinates, handling of subordinates’ 
suggestions, interpersonal relations with peers, reactions 
to superior’s suggestions, etc. 

¹ Now at U. S. Naval Personnel Research Field 
Activity, Washington, D. C. 

Materials for the actual items were obtained by 
three methods: (a) Open-ended interviews were conducted 
with majors and lieutenant colonels attending 
the Field Officers Course at the Air University; the 
interviewers probed if the subjects did not provide 
information about the 18 categories. (b) A ques- 
tionnaire was administered to other field grade offi- 
cers asking them to describe another officer’s behav- 
iors on each of the categories. (c) Items were taken 
from the AIR report and from other studies con- 
ducted for the Air Force, e.g., Ruch (5). 

On the basis of the materials thus collected, 105 
descriptions of effective and ineffective human rela- 
tions behaviors were collated within 19 major areas. 
The areas did not have an equal number of items. 


Format of the Instrument 

The various behaviors in each of the areas were 
objectively described, not evaluated, and the rater 
checked the appropriate position on a frequency of 
performance scale extending from “never” to “al- 
ways.” This was designed to reduce halo and leni- 
ency. A typical item selected at random from the 
test reads as follows: When he and another officer 
were given a job to do, how did he function with 
the other officer? “Did as little of the work as he 
could get away with.” This item, as all others, was 
checked on the “never” to “always” scale. The 
evaluation of each of the alternative responses was 
made by a group of judges, as explained below.


Test Items 
Evaluation of Responses 


When the 105 items, each with its own 
scale, were assembled into a booklet, a group 
of eight colonels attending the Air War Col- 
lege at the Air University was asked to evalu- 
ate each of the responses. Weights were as- 
signed according to their judgments of the 
importance of each behavior for human relations
in the Air Force. They used a maximum
range of nine points for each item, -4
through +4 (although not all items were 
assigned maximum scores and the sign of 
weights, positive or negative, sometimes 
changed at the end of a scale). 
Table 1

Comparison of a Group of Colonels Used as Judges
vs. All Colonels in the Air Force on Four
Demographic Variables

                             Sample    Population
Regular A.F.                 87%       86%
Rated (Flyers)               75%       61%
Bachelor degree or higher    75%       75%
Mean Age                     40        42
N                            8         Classified


Reliability of the Judges 


A correlation of agreement among the eight 
judges was obtained with the Horst (3) for- 
mula. The obtained r (.94) is satisfactory 
for the present purposes, and indicates that 
opinions about the importance of these hu- 
man relations behaviors are quite consistently 
shared by all the colonels. 

Table 1 suggests that our sample is not un- 
like the population of Air Force colonels on 
the variables indicated. 


Item Analysis 


The first form of the OBD was adminis- 
tered to 263 students in the Academic Instruc- 
tors Course at the Air University. They were 
instructed to think of the peer officer with 
whom they worked the closest at their last as- 
signment, and to describe his behaviors on 
the questionnaire. Reliability of responses, 
as ascertained by the Kuder-Richardson for- 
mula No. 21, was .97. 
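
For readers unfamiliar with it, Kuder-Richardson formula No. 21 estimates reliability from only the number of items, the mean total score, and the variance of total scores. The sketch below is a minimal illustration of the standard formula for items scored 0-1; the figures passed to it are invented, and how the multi-point OBD items were reduced to that form is not stated in the text.

```python
def kr21(num_items, mean_score, variance):
    """Kuder-Richardson formula 21 for a test of 0-1 scored items.

    num_items  -- number of items in the test (k)
    mean_score -- mean of the total scores (M)
    variance   -- variance of the total scores (s^2)
    """
    k = num_items
    return (k / (k - 1)) * (1 - mean_score * (k - mean_score) / (k * variance))


# Hypothetical figures, not taken from the study.
example = kr21(num_items=10, mean_score=5.0, variance=10.0)
```

The formula assumes items of roughly equal difficulty, so it is usually read as a conservative estimate.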

The items were then scored in three ways,
all based on the weights assigned by the Air
War College judges: (a) 1-9, which entailed
adding a constant of 5 to the judges’ weights;
(b) 1-5, in which a weight of 1 was assigned to the
poorest end of the frequency of occurrence
scale and increased linearly to the opposite
end of the scale, with 3 being assigned to the
“don’t know” response (e.g., if “never” had
been weighted negatively it was then assigned
a weight of 1, “occasionally” was weighted 2,
“don’t know” was 3, “frequently” 4, and “always” 5);
(c) 0-1, in which the zero was assigned to
all frequencies which had previously been
weighted negatively, the 1 to all positively
weighted frequencies. The intercorrelations
of results with the three scoring methods were
.98 or .99.
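
The three scoring methods can be sketched as follows. The response labels are those quoted in the text; the judge weights for the sample item are invented for illustration, and scoring a zero-weighted response as 0 under method (c) is an assumption, since the text does not say how such responses were handled.

```python
# Response options in their printed order on the frequency scale.
OPTIONS = ["never", "occasionally", "don't know", "frequently", "always"]

# Hypothetical judge weights (-4..+4) for one item; not taken from the study.
SAMPLE_WEIGHTS = {"never": -4, "occasionally": -2, "don't know": 0,
                  "frequently": 2, "always": 4}


def score_1_to_9(response, weights):
    """Method (a): add a constant of 5 to the judges' weight."""
    return weights[response] + 5


def score_1_to_5(response, weights):
    """Method (b): 1 at the poorest end, rising linearly to 5 at the other end."""
    position = OPTIONS.index(response) + 1
    # If "never" carries the negative weight, it is the poorest end of the scale.
    return position if weights["never"] < 0 else 6 - position


def score_0_1(response, weights):
    """Method (c): 1 for positively weighted responses, 0 otherwise (assumption for zero)."""
    return 1 if weights[response] > 0 else 0
```

For an item on which "always" is the undesirable response, the judges' weights are reversed and method (b) then assigns 5 to "never".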

Item validities were determined by two 
methods: (a) the Pearson, and (b) a method 
which has been devised by the education 
evaluators at Air University.² The latter is
an approximation of the Pearson. It was obtained
by scoring the items 0-1, and sorting
the total test scores into five equal groups in 
descending order of scores. The r is obtained 
by the following formula: 


$$ r = \frac{4(\mathrm{I} - \mathrm{V}) + 2(\mathrm{II} - \mathrm{IV})}{\sqrt{8RW}} $$


where:


I, II, IV, and V = the number of people in
each group who answered the item correctly.

R = total number right

W = total number wrong
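
The relation of this shortcut to the Pearson coefficient can be checked numerically. The sketch below assumes that the five equal groups are coded +2, +1, 0, -1, -2 and that the shortcut reads r = [4(I - V) + 2(II - IV)] / sqrt(8RW); under those assumptions it equals the Pearson r between the 0-1 item score and the group code. The coefficients are a reading consistent with that derivation rather than a confirmed transcription, and the counts used are invented.

```python
import math


def au_item_r(right_by_group, total_right, total_wrong):
    """Shortcut item validity from the counts answering correctly in groups I..V."""
    i, ii, _iii, iv, v = right_by_group
    return (4 * (i - v) + 2 * (ii - iv)) / math.sqrt(8 * total_right * total_wrong)


def pearson_r(xs, ys):
    """Ordinary product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: five groups of four people, 4, 3, 2, 1, 0 answering correctly.
group_size = 4
right_counts = [4, 3, 2, 1, 0]
codes, item_scores = [], []
for code, n_right in zip([2, 1, 0, -1, -2], right_counts):
    for person in range(group_size):
        codes.append(code)
        item_scores.append(1 if person < n_right else 0)

total_right = sum(item_scores)
total_wrong = len(item_scores) - total_right
shortcut = au_item_r(right_counts, total_right, total_wrong)
direct = pearson_r(codes, item_scores)
```

Since the denominator depends only on R and W, a table of divisors indexed by the number right is all that is needed to compute each item's r.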


The item validities of the two methods correlated
.82. The AU method is the more economical
of time since a standard table of
divisors is easily prepared and each item’s r
can be obtained in a simple machine operation.

This method of item analysis was used in 
the development of the present instruments; 
74 items within 17 areas were judged to con- 
tribute sufficiently to the total test score to 
warrant inclusion in the final form of the test. 
One area was dropped because of insufficient 
items remaining in it, leaving 17 areas in the 
final form. 


Ratees 


Four hundred and ninety-four Air Force 
majors and lieutenant colonels attending the 
Air Command and Staff School served as subjects
for the OBD as well as for the Attitudes
Test in Human Relations (7). 


Raters 


The raters were a subordinate, a peer, and 
a superior officer who had served with each 
subject immediately prior to the latter’s transfer
to the Air University. The raters were
advised that the data were to be used only
for research and would not appear on any
official records of the subject, nor would they
affect him in any way.

2 The item analyses were run by Dr. Kenneth
Groves, Chief, Educational Evaluation Branch, Air
University.


Weighting Classes of Raters 


One of the shortcomings of multiple rating 
procedures is that all raters do not usually 
have equal opportunities to observe the per- 
son they are rating. This error is magnified 
when raters are asked to make ratings or de- 
scriptions of characteristics which they can- 
not observe directly but must learn about from 
secondary sources. A differential weighting 
of raters’ responses according to their oppor- 
tunities to make the necessary observations 
might help to reduce some of the error. 

Accordingly, 22 majors and lieutenant colo- 
nels at Maxwell Air Force Base were asked 
to make judgments of the “observation op- 
portunities” of the three classes of raters on 
each item. They assigned weights from 1 to 
10, higher numbers signifying better observa- 
tion opportunities. The weights for all items 
in an area were averaged for each class of 
rater; thus, three weights for each of the 17 
areas were determined. Most of the areas 
were weighted 3 for subordinates, 2 for peers, 
and 1 for superiors. 

Agreement among the 22 judges was con- 
sistently high on all classes of raters. The 
Horst rs on subordinates, peers, and superiors 
were .93, .85, and .86 respectively. 

However, differential weighting by area was 
not as parsimonious as weighting the total 
score. The scores obtained by the two pro- 
cedures correlated .97, which allowed the sim- 
pler procedure to be used. 
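
Weighting the total score by class of rater might be sketched as below. The weights of 3, 2, and 1 are the values the text reports for most areas; applying them to the total score, and dividing by the sum of the weights, are assumptions made for illustration. The scores are invented.

```python
# Observation-opportunity weights reported for most areas (assumed here for the total score).
RATER_WEIGHTS = {"subordinate": 3, "peer": 2, "superior": 1}


def weighted_total(scores_by_rater):
    """Weighted mean of the three raters' total OBD scores for one subject."""
    weighted_sum = sum(RATER_WEIGHTS[rater] * scores_by_rater[rater]
                       for rater in RATER_WEIGHTS)
    return weighted_sum / sum(RATER_WEIGHTS.values())


# Hypothetical total scores for one subject.
example_total = weighted_total({"subordinate": 300, "peer": 330, "superior": 360})
```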


The Mail Questionnaire 


Three questionnaires and appropriate in- 
structions were sent to the Commanding Offi- 
cer at the last duty station of each subject; 
the CO was requested to have the raters com- 
plete the questionnaires and mail them to this 
Laboratory. Thirty days after the initial 
mailing, follow-up letters were sent where 
necessary. The termination date for receipt 
of questionnaires was established as 90 days 
after the initial mailing. Complete returns
from all three raters were obtained on 282
subjects.
Table 2


Intercorrelations of Unweighted OBD Scores Among 
Subordinate, Peer, and Superior Officers 





Subordinate Peer Superior 
Subordinate .13* i 
Peer a 
Superior 


* Significant at the .05 level. 
** Significant at the .01 level. 


Inter-Rater Agreement on the OBD 


On the basis of the judgments of observa- 
tion opportunities just described, one would 
not expect the three raters to be in substan- 
tial agreement on their descriptions of the 
subjects. Although the correlations in Table 2 
probably are not chance occurrences, the 
agreement among the raters is far from sub- 
stantial. This might be expected in the light 
of the differing opportunities for observation. 


Statistical Characteristics of the OBD 
The Leniency Effect 

The data obtained on the OBD indicate 
that the procedure of separating the descrip- 
tion from the evaluation did not entirely 
eliminate the leniency error; the distribution 
is markedly skewed to the left, much as are 
the curves obtained with traditional rating 
procedures. The raters in this study may 
have applied their personal evaluation schema 
to the questionnaire items (which would be 
expected to be similar to the one established 
by the group of colonels), and “evaluated”
the subjects rather than described them. It 
may not be possible for untrained raters to 
objectively describe other people by main- 
taining a dispassionate view and suppressing 
their anxieties about the consequences of 
their ratings on the promotional opportuni- 
ties of the ratee. It was not feasible in this 
study to use the forced normalization or the 
forced-choice technique, both of which are 
reputed to reduce this error (6, 9). 


Reliability 


Completed questionnaires from all three 
raters were returned for only 57% of the 
population. Accordingly, computation of reliability
by the usual inter-rater agreement
was not practicable. The significant, albeit
low, correlations between classes of raters
(Table 2) are suggestive of some agreement,
or reliability. Bearing in mind the fact that 
the classes of raters did not have equal op- 
portunities to observe the on-the-job behav- 
iors of the subject officers, it is not unrea- 
sonable to expect the rs to increase if the 
observation opportunities are equalized. Since 
this would be the case with multiple ratings 
within classes, one could assume that these 
ratings would have an acceptable reliability. 


Validity 


On the other hand, validity data are not 
available because there is no “higher” cri- 
terion against which to correlate the OBD. 
However, if “good” human relations in a par- 
ticular society may be defined in terms of 
what its power figures sanction or punish, 
then by definition this test measures, to some 
extent, what it is supposed to measure since 
the items were weighted by Air Force colo- 
nels, who are power figures in their society, 
on the basis of the items’ importance for good 
human relations. Further grounds for sup- 
port of its presumptive validity are: (a) its 
face validity, which is evident upon examination
of the items, and (b) the fact that the
items measure the critical incident (4) areas 
which were judged to be related to human 
relations. 


Summary 


The Officer Behavior Description is an ob- 
jective, checklist type of test which meas- 
ures human relations behavior on the job. It 
covers 17 areas which were adapted from an 
earlier critical-incident study in the Air Force.
For construction of the OBD a group of 282 
Field Grade Officers attending the Air Uni- 
versity were described by a subordinate, a 
peer, and a superior officer at the subject offi- 
cers’ last duty station. The descriptions were 
scored using weights assigned by a sample of 
colonels attending the Air War College. Di- 
rect measures of validity and reliability were 
unobtainable, although related data and con- 
siderations suggest that if validity and reli- 
ability statistics were available the coefficients 
would be satisfactory. 


Received July 16, 1956. 
An Analysis of J-47 Jet Mechanic Checklist Responses
for Response Set and Consistency¹


Willard E. North 


Personnel Laboratory, Air Force Personnel and Training Research Laboratory 


Introduction 


Agencies dealing with such problems as se- 
lection, placement, training, and job classifi- 
cation frequently need information related to 
the work activities that personnel are re- 
quired to perform within a specified job. 
Such information is usually concerned with 
the number of people performing a given task, 
the extent of specialization within a job, the 
frequency with which certain tasks are per- 
formed, and the amount of supervision re- 
quired to perform a task. 

The checklist survey method appears to be 
a good procedure for obtaining this kind of 
information on large samples. Some of the 
reasons for believing that it has certain ad- 
vantages over the more typical forms of job 
analysis are: 

1. A larger, more representative sampling 
of job incumbents is feasible. 

2. Standardized responses are required from 
all incumbents, thereby making analysis easier. 

3. Activity statements are presented to the 
subjects in such a manner that a wide variety 
of judgments can be elicited. 

4. Wide sampling with a checklist permits the
study of within-job specialization and variations
in the job due to differences in base locations
and missions.


Scope of this Study 


The J-47 Jet Mechanic Checklist was de- 
veloped as a part of a larger project. The 
purpose of the over-all project was to de- 
velop a proficiency test for the J-47 mechanic. 


1 This report is based on work done under ARDC 
Project No. 7950, Task No. 77252, in support of the
research and development program of the Air Force 
Personnel and Training Research Center, Lackland 
Air Force Base, Texas. Permission is granted for 
reproduction, translation, publication, use, and dis- 
posal in whole and in part by or for the United 
States Government. 


The checklist and the responses obtained from 
a sample of mechanics were to be used as aids 
in item construction. The checklist also was
to serve as one instrument in a subsequent
test evaluation procedure.

It was decided that an initial study should 
be made to determine the care with which 
subjects made their responses to items in the 
checklist. Therefore, two hypotheses, one re- 
lated to response set, the other to response 
consistency, were formulated. These two hy- 
potheses were: 

1. Response set will take place in the later 
portion of the checklist. Previous experience 
with checklists had raised some doubt as to 
whether the subjects answered items as care- 
fully at the end of the checklist as they did 
at the beginning. It was believed that the 
subjects possibly became bored part way 
through the checklist, then tended to pick one 
response category and use it consistently. 

2. Consistency of response to items will be 
poorer for those items appearing at the end 
of the checklist than for items appearing at 
the beginning. Consistency of response is de- 
fined as the amount of agreement between the 
response that a mechanic makes to an item 
on the checklist and the response that he 
makes to that same item in a follow-up inter- 
view. This hypothesis is related to the first. 
If response set occurred in the checklist there 
should be a corresponding deterioration in 
agreement. On the other hand, even if there 
were no response set, deterioration in agree- 
ment might still take place. This could oc- 
cur if there was a tendency to give random 
responses due to boredom or fatigue. 

This pilot study served one additional pur- 
pose. It was assumed that the checklist cov- 
ered the mechanic’s job completely. This as- 
sumption could be partially verified by having 
the subjects write in any task they performed 
that had been omitted on the checklist. 
Procedure


Construction of the J-47 Jet Mechanic Check- 
list 


Nine job analysts, usually in pairs, visited nine Air 
Force bases to gather information about the nature 
of the task activities of J-47 jet mechanics. A total 
of 39 jet mechanics were interviewed on an indi- 
vidual basis by an analyst. The interviewees worked 
on one of the following aircraft: F-86, B-36, B-47, 
or F-86D, all of which are equipped with the J-47 
engine. 

Task statements were developed by each analyst 
from each of his interviews. These task statements 
from the 39 interviews were placed on cards, one per 
card. Two job analysts and three J-47 technical ex- 
perts, working independently, served as judges to 
develop task lists from these task statements. Since 
many of the job analysis interviews produced simi- 
lar task statements, it was the individual judge’s job 
to group these tasks and write a covering statement.
In general, agreement among the judges was good.
Those disagreements that did exist were resolved by
the judges in committee action. As a result of this
procedure a master checklist containing 220 items
was prepared from the five individual task lists.

The items were designed to elicit information about 
the frequency with which each task is performed and 
about the amount of assistance given or received in 
the performance of each task. It was thought that 
responses regarding the frequency of task performance
might differ if responses were expressed in frequency
units rather than in frequency-per-time-interval
units. Therefore, two forms were developed.

The Absolute Scale required responses to frequency
of task performance within the last three months.
The following scale values were assigned:


0 I do not perform task.
1 1 to 2 times.
2 3 times.
3 4 to 14 times.
4 15 to 60 times.
5 60 times or more often.


The Graphic Scale required responses to frequency
of task performance per time interval. The scale
intervals were arbitrarily assigned values as follows:


0 I do not perform task.
1 Less than once a month.
2 Once a month.
3 More often than once a month, but not more often than once a week.
4 More often than once a week, but not more often than once a day.
5 More often than once a day.


The response categories proffered for obtaining in- 
formation about the amount of assistance given or 
received in the performance of tasks were assigned 
scale values as follows: 


0 I do not perform this task.
1 I perform this task with technical assistance.
2 I perform this task without technical assistance.
3 I supervise this task.


This four-point scale hereafter will be called the 
Level of Performance Scale. These level-of-perform- 
ance response categories were appended to both fre- 
quency scales. The combination of the Absolute 
Frequency Scale and the Level of Performance Scale 
will hereafter be designated as the Absolute Form, 
and the Graphic Frequency Scale and Level of Performance
Scale as the Graphic Form.

In using either of the two forms, subjects were in- 
structed to think of how often they had performed 
the task in the last three months. The
frequency-response categories were equated in both scales.
Since the scales probably do not belong to the same
metric, however, no between-scale comparisons have
been made. The two hypotheses were tested on the
two forms separately.

The 220 items were then divided into four sections 
of 55 items each. These four sections were labeled 
A, B, C, and D. Checklist booklets were prepared 
counterbalancing the sections so that four orders of 
presentation could be used. This procedure was fol- 
lowed for both frequency scale forms. These four 
orders of presentation and section position in check- 
list are: 

Order of
Presentation    Section Position

4 3 
I i i 
II B 
If ; A 
IV ; D 
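
The counterbalancing requirement is that every section occupy every position exactly once across the four orders of presentation. A minimal sketch of one arrangement that satisfies it, a cyclic Latin square, is given below; the particular orders it produces are illustrative and are not claimed to be those used in the study.

```python
def cyclic_latin_square(sections):
    """Return one order of presentation per row; each section appears once per row and column."""
    n = len(sections)
    return [[sections[(row + col) % n] for col in range(n)] for row in range(n)]


# Rows are orders of presentation I-IV; columns are section positions 1-4.
orders = cyclic_latin_square(["A", "B", "C", "D"])
for label, order in zip(["I", "II", "III", "IV"], orders):
    print(label, " ".join(order))
```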


Administration of the Checklist 


The checklists were administered to 70 J-47 mechanics
at one Air Force base. In order to be included
in the sample, the subjects had to have been
working as jet mechanics at their present base
throughout the three preceding months. The checklist
was administered in four sessions over a two-day
period, one period each morning and one period
each afternoon. Each form was administered
in one morning session and one afternoon session.
The checklist booklets were distributed to the
mechanics randomly.


The Follow-up Interview 

At the end of the first testing day a tally was
made of all items to determine the percentage of
mechanics performing each task. From those tasks
having percentages between 30 and 70 per cent, 50
items were selected. Twelve items were selected
from each of Sections A and D, and 13 from each of
Sections B and C. Within a section, items were selected
randomly from those items having percentages
between 30 and 70. Lists composed of these 50
tasks were prepared for use by four interviewers.

Thirty-nine of the original 70 mechanics were interviewed
on the 50 selected tasks. The purpose of
the interview was not only to determine the con- 
sistency of checklist response, but also to gain some 
clinical insight into how the mechanics estimated the 
frequency of performance. All interviews were made 
within one to two days after the mechanics had com- 
pleted the checklist. These interviews were tape re- 
corded. 


Results 


In order to determine whether a response 
set was occurring, a change of response score 
was computed for each individual on each 
section of the checklist. 

This score was computed by first noting the 
response that the mechanic made on the first 
item of a section. Each time he made a 
change in either his level or frequency of per- 
formance response to a subsequent item the 
mechanic was given a score of one. Thus, if 
the mechanic gave the same response to each 
item throughout a given section, he received 
a score of zero. If he made a change of re- 
sponse on each subsequent item in the sec- 
tion, he would receive the maximum score of 
54. 
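
The change of response score can be sketched as below. It assumes that a "change" means a response differing, in level or in frequency, from the response to the immediately preceding item; that reading is the one under which a 55-item section yields the stated maximum of 54. The response values are invented.

```python
def change_score(responses):
    """Count items whose (level, frequency) response differs from the preceding item's."""
    return sum(1 for previous, current in zip(responses, responses[1:])
               if current != previous)


# Hypothetical 55-item sections: (level of performance, frequency) per item.
same_throughout = [(2, 3)] * 55
alternating = [(2, 3) if i % 2 == 0 else (1, 3) for i in range(55)]
```

A low score on a late section, relative to earlier sections, is what a response set would be expected to produce.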

The checklist had been administered to 70 
mechanics. Eight of these were dropped from 
the sample because they did not meet the requirement
of having spent three months on
the job at that base. Of the remaining 62
subjects, 30 had taken the Absolute Form and
32 the Graphic Form. Each form had one
order of presentation in which only six subjects
were obtained. In order to facilitate
the two analyses of variance by using equal
numbers in each order of presentation, six
subjects were randomly selected from each of
the other orders.


Table 1

The Mean Number of Response Changes by Section
Positions for Subjects Taking Either the
Absolute or Graphic Form

Section Positions

Form 1 2 3

19.21
17.00

21.46
18.42

22.35
19.75

Absolute
Graphic

If the hypothesis concerning response set is 
substantiated, there should be a decrease in 
the section-position means between the first 
and last sections taken by the subjects. The 
means presented in Table 1 show a general 
but small decline for both forms across sec- 
tion positions. There is a slight inversion in 
means between Section Positions 3 and 4 on 
both of the forms. 

In order to determine whether these differences
in section-position means are significant,
two analyses of variance were made.
These summaries are presented in Tables 2
and 3.

The design for the analyses of variance
carried out in this study is given by Edwards
(1). It is a Latin square replicated. That is,
instead of just one subject taking each order
of presentation, several subjects are assigned
to each order of presentation. Two error
terms are required because the mean square
for order of presentation is based on one total
score for each subject while the mean
squares for sections and section positions are
based on four correlated scores. Tables 2, 3,
5, 6, 7, and 8, for this reason, have been divided
into two parts. The designation “Independent
Observations” refers to the mean
squares based on one score per subject. The
designation “Correlated Observations” refers
to the mean squares based on four scores for
each subject.

To test the hypothesis relating to response
set, we must consider the F for section positions.
If a response set took place, this F
should be significant. The obtained F for the
Graphic Form is 1.68 and for the Absolute
Form 1.53. Neither F is significant at the
5% level of confidence for 3 and 60 degrees
of freedom.

The only significant F in these two analyses
is the one for the mean square between
sections for the Absolute Form. Apparently
some sections are more appropriate than
others for the group taking the Absolute
Form.


Table 2

Analysis of Variance of the Number of Response Changes by Section (Graphic Form)

Sources of Variance                      Sum of Squares*    df    Mean Square

Independent Observations
  Order of presentation                       505.87         3      168.62
  Residual between individuals               4857.62        20      242.88
  Total between individuals                  5363.49        23

Correlated Observations
  Sections                                    109.28         3
  Section positions                           152.28         3
  Residual from Latin square (error)          128.31         6
  Residual within individuals (error)        1807.88        60
  Total within individuals                   2197.75        72

Total                                        7561.24        95

* Decimal points were omitted from the original score for ease of computation.
† No F is significant at the 5% level of confidence.


Table 3

Analysis of Variance of the Number of Response Changes by Section (Absolute Form)

Sources of Variance                      Sum of Squares*    df    Mean Square

Independent Observations
  Order of presentation                       164.58         3       54.86
  Residual between individuals               4097.92        20      204.90
  Total between individuals                  4262.50        23

Correlated Observations
  Sections                                    588.17         3      196.06
  Section positions                           133.75         3       44.58
  Residual from Latin square (error)          297.17         6       49.53
  Residual within individuals (error)        1746.41        60       29.11
  Total within individuals                   2765.50        72

Total                                        7028.00        95

* Decimal points were omitted from original score for ease of computation.
† The F for the section mean square is significant at the 1% level of confidence. No other F is significant at the 5% level of confidence.
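
The F ratios quoted for section positions can be recomputed from the figures in Tables 2 and 3. The sketch below assumes the error term is the residual within individuals, which is the term with the 60 degrees of freedom named in the text. The critical values 2.76 and 4.13 are the usual tabled values of F for 3 and 60 degrees of freedom at the 5% and 1% levels; they are supplied here and do not come from the article.

```python
F_CRIT_05 = 2.76  # tabled F, 3 and 60 df, 5% level (supplied, not from the article)
F_CRIT_01 = 4.13  # tabled F, 3 and 60 df, 1% level (supplied, not from the article)


def f_ratio(ms_effect, ms_error):
    """Ratio of an effect mean square to its error mean square."""
    return ms_effect / ms_error


# Absolute Form (Table 3): mean squares as printed.
f_abs_positions = f_ratio(44.58, 29.11)
f_abs_sections = f_ratio(196.06, 29.11)

# Graphic Form (Table 2): mean squares formed from the printed sums of squares.
f_graphic_positions = f_ratio(152.28 / 3, 1807.88 / 60)

print(round(f_abs_positions, 2), round(f_graphic_positions, 2))
```

Both section-position ratios fall below 2.76, while the Absolute Form ratio for sections exceeds 4.13, in agreement with the table footnotes.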


Consistency of Mechanics’ Responses 


It will be recalled that each of the follow- 
up interviews was tape recorded. Five raters 
reviewed independently each of the 39 re- 
corded interviews. On a special rating sheet, 
they recorded the level and frequency of performance
response that they judged the mechanic
should have made to each of the 50
items.


Table 4

Mean of the Deviation Squared Scores Made on Level
and Frequency of Performance Responses
by Mechanics Taking Either the
Absolute or Graphic Form

Level of Task Performance Responses
Section Positions
1 2

Absolute
Graphic

87
6

1.01
83 73

Frequency of Task Performance Responses
Section Positions

Absolute
Graphic


Table 5

Analysis of Variance of the Mechanics Deviation Squared Scores on Level of Performance Responses
(Absolute Form)

Sources of Variance                      Sum of Squares*    df    Mean Square

Independent Observations
  Order of presentation                     53,120.5         3     17,706.8
  Residual between individuals             116,037.0         8     14,504.6
  Total between individuals                169,157.5        11

Correlated Observations
  Sections                                   8,984.5         3      2,994.8
  Section positions                         12,133.2         3      4,044.4
  Residual from Latin square (error)        12,019.8         6      2,003.3
  Residual within individuals (error)       67,697.0        24      2,820.7
  Total within individuals                 100,834.5        36

Total                                      269,992.0        47

* Decimal points were omitted from original scores for ease of computation.
† No F is significant at the 5% level of confidence.


Table 6

Analysis of Variance of the Mechanics Deviation Squared Scores on Level of Performance Responses
(Graphic Form)

Sources of Variance                      Sum of Squares*    df    Mean Square

Independent Observations
  Order of presentation                      8,205.5         3      2,735.2
  Residual between individuals             125,879.4        12     10,490.0
  Total between individuals                134,084.9        15

Correlated Observations
  Sections                                   5,519.8         3      1,839.9
  Section positions                          5,298.2         3      1,766.1
  Residual from Latin square (error)        10,858.2         6      1,809.8
  Residual within individuals (error)       88,029.3        36      2,445.2
  Total within individuals                 109,705.5        48

Total                                      243,790.4        63

* Decimal points were omitted from original scores for ease of computation.
† No F is significant at the 5% level of confidence.

For purposes of analysis, the level of per- 
formance and the frequency of performance 
responses were treated separately. Arbitrary 
scale values were assigned to each level using 
the values that have been previously indi- 
cated. For each type of response on each 


item, the raters’ judgments were converted to 
the arbitrary scale values previously indi- 
cated. The mean of the five raters’ scale 
values was then obtained for each item. The 
mechanics’ checklist responses were converted 
also to the appropriate scale values. The dif- 
ference between the scale value of a checklist 
response and the mean scale value of the five 
raters’ judgments was then determined. This
difference was squared to eliminate negative 
signs. The section consistency score for each 
mechanic was obtained by summating these 
squared differences for all items within a sec- 
tion and dividing by the number of items in 
that section. 
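
The section consistency score described above can be sketched as follows: for each item, the checklist scale value is compared with the mean of the five raters' scale values, the difference is squared, and the squared differences are averaged over the items in the section. The function and variable names, and the sample values, are illustrative.

```python
def section_consistency(checklist_values, rater_values):
    """Mean squared deviation of checklist responses from the raters' mean judgment.

    checklist_values -- one scale value per item, from the mechanic's checklist
    rater_values     -- for each item, the scale values judged by the raters
    """
    squared = []
    for checklist, judgments in zip(checklist_values, rater_values):
        rater_mean = sum(judgments) / len(judgments)
        squared.append((checklist - rater_mean) ** 2)
    return sum(squared) / len(squared)


# Hypothetical two-item section scored on the Level of Performance Scale (0-3).
example_score = section_consistency([2, 1], [[2, 2, 2, 2, 2], [3, 3, 3, 3, 3]])
```

Larger scores indicate poorer agreement between checklist and interview, so the hypothesis predicts larger scores at later section positions.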

Since it was not possible to designate which
subjects were to return for the follow-up interviews,
unequal numbers of subjects were
obtained for the various orders of presentation.
Of those subjects taking the Absolute
Form for one order of presentation, only
three subjects were interviewed, while the
Graphic Form had one order in which only
four subjects were interviewed. Subjects were
drawn randomly from the other orders to
equate the numbers of subjects in each order.
This resulted in 12 and 16 subjects for the
Absolute and Graphic Forms, respectively,
that were available for the analyses of variance.


Table 7

Analysis of Variance of the Mechanics Deviation Squared Scores on Frequency of Performance Responses
(Absolute Form)

Sources of Variance                      Sum of Squares*    df    Mean Square    F†

Independent Observations
  Order of presentation                    119,873.2         3     39,957.7
  Residual between individuals             606,202.9         8     75,775.4
  Total between individuals                726,076.1        11

Correlated Observations
  Sections                                  54,545.3         3     18,181.8    1.59
  Section positions                         37,761.7         3     12,587.2    1.10
  Residual from Latin square (error)        96,831.0         6     16,138.5    1.41
  Residual within individuals (error)      275,163.8        24     11,465.2
  Total within individuals                 464,301.8        36

Total                                    1,190,377.9        47

* Decimal points were omitted from original scores for ease of computation.
† No F is significant at the 5% level of confidence.


Table 8

Analysis of Variance of the Mechanics Deviation Squared Scores on Frequency of Performance Responses
(Graphic Form)

Sources of Variance                      Sum of Squares*    df    Mean Square    F†

Independent Observations
  Order of presentation                    145,640.3         3     48,546.8    2.33
  Residual between individuals             249,847.3        12     20,820.6
  Total between individuals                395,487.6        15

Correlated Observations
  Sections                                  25,604.8         3      8,534.9
  Section positions                         24,580.2         3      8,193.4
  Residual from Latin square (error)       102,115.3         6     17,019.2
  Residual within individuals (error)      740,675.5        36     20,574.3
  Total within individuals                 892,975.8        48

Total                                    1,288,463.4        63

* Decimal points were omitted from original scores for ease of computation.
† No F is significant at the 5% level of confidence.

The section-position means for the mechan- 
ics are presented in Table 4. These means 
are based on the number of cases used in the 
four analyses of variance to be presented 
later. The means shown in Table 4 change 
in the direction hypothesized. That is, the 
means in all cases are larger for Section Posi- 
tion 4 than for Section Position 1, and there 
is a general tendency for the means to in- 
crease as a section appears later in order. 
However, several inversions may be noted. 

To determine if any of the differences in 
means are significant, an analysis of variance 
was made for each type of response on each 
form. The summaries of these four analyses 
are presented in Tables 5, 6, 7, and 8. 

The hypothesis of a decrease in mechanic 
consistency across section positions is tested 
by the section-position mean squares. None 
of the Fs for the four section-position mean 
squares is significant at the 5% level of con- 
fidence. The observed increases in means 
must be considered as having occurred by 
chance. 

The adequacy of checklist coverage was to 
be judged by the number of task statements 
written in by mechanics concerning activities 
performed by them and not covered by the 
checklist. Very few statements were written 
in. These were reviewed by two job analysts 
and were judged to have actually been cov- 
ered by the checklist. 
Summary and Conclusions


A checklist containing 220 work activity 
statements was constructed. Two forms of 
the checklist were prepared, each of which 
associated level of performance with a differ- 
ent frequency of performance scale. The sec- 
tions within each form were presented in 
counterbalanced orders. The two forms were 
then presented to a total of 70 mechanics, ap- 
proximately half taking each form. Thirty- 
nine of these mechanics were recalled and in- 
terviewed on 50 of the 220 items. Analyses 
of the mechanics’ responses were made to de- 
termine whether or not a tendency toward re- 
sponse set occurred, the degree of consistency 
between checklist response and interview re- 
sponse to the same item, and the adequacy of 
checklist coverage. 

Based on a very limited number of cases, 
the following conclusions seem warranted: 


1. No evidence of response set was found. 

2. No evidence was found to indicate that 
response consistency was better at the begin- 
ning of the checklist than at the end. 

3. The checklist appears to cover the job
adequately. The fact that there were no sig-
nificant contributions by the mechanics does
not guarantee that coverage was complete. It
strongly suggests, however, that no major
area had been omitted.
Combat Performance: Measurement and Prediction ¹


David K. Trites and Saul B. Sells

This report describes a research project un-
dertaken to evaluate the measurement of com-
bat effectiveness of Air Force pilots and the
prediction of these measures by data collected 
while the pilots were in flight training. This 
project is part of the research program of the 
Department of Clinical Psychology, School of 
Aviation Medicine, USAF, leading to the de- 
velopment of adaptability screening instru- 
ments for the selection of aircrew personnel. 

Specifically, answers were sought to the fol- 
lowing questions: (a) Are ratings of combat 
performance and adjustment by peers, su- 
periors, and psychologists related to objective 
performance data? (b) To what extent are 
peer-superior ratings related to psychologist 
ratings which are in part based upon informa- 
tion obtained from peers and superiors? (c) 
Are combat criteria predictable by precombat 
criteria of performance and adjustment? 


Procedure 


In 1951, and again in 1953, research teams ² were
sent to the Far Eastern Theater of Operations by 
the School of Aviation Medicine to obtain informa- 
tion on the field performance of pilots who previ- 
ously had been administered a test battery at the 
beginning of pilot training. Because of difficulties 
encountered in locating subjects and the necessity of 
collecting data in a manner which would not inter- 
fere with operations, only a relatively small number 
of subjects could be studied. And even among these 
it was frequently impossible to collect all the desired 
information. A further restriction was placed on the 
number of usable cases by the completeness of train- 
ing level data for cases located. Consequently, the 
present analysis had to be based on only 65 subjects 
for whom relatively complete overseas and training 
data had been obtained. It is realized that generali-
zation of results from such a small sample is a haz-
ardous undertaking. However, the findings of this
research do permit tentative answers to the ques-
tions posed and suggest certain hypotheses for fu-
ture research.

1 This paper was presented in slightly modified
form at the convention of the American Psychologi-
cal Association, San Francisco, California, in Sep-
tember, 1955.

2 Captain Charles A. Downing (USAF, MSC) and
Lieutenant John G. Bradley (USAF), psychologists,
in 1951, and Captain Blair W. Sparks (USAF), psy-
chologist, and Captain Frederick Hinman (USAF,
MC) in 1953.

The combat and training level data can be grouped 
into three types: objective data, peer-superior evalu- 
ations, and clinician ratings. The objective informa- 
tion collected overseas may be defined as enumera- 
tion data, representing number of hours flown in 
combat and number of noncombat flying hours. Ob- 
jective training data may be defined as composite 
scores resulting from relatively objective measuring 
devices or as biographical information, such as age. 
Peer-superior evaluations are ratings or comments 
made by a man’s associates and commanding officers 
upon request. Such raters are usually not specifically 
trained to make evaluations of this type, but having 
observed the behavior of the men to be rated, they 
can furnish valuable data. Clinicians, by definition, 
have as their primary mission the evaluation of men 
and are trained specifically for the task. In present- 
ing the results of this investigation, the 21 variables 
analyzed are grouped according to this threefold 
classification.

Table 1 contains a short description of the vari- 
ables grouped by category. Eleven of these vari- 
ables were collected overseas, the remaining 10 while 
the men were in flight training. The analysis is in 
terms of correlations. 


Results and Discussion 


To evaluate the first question, the peer- 
superior and psychologists’ ratings were cor- 
related with the two objective performance 
variables. These correlations are presented 
in Table 2. 
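
The report does not state which test was used to attach significance levels to the product-moment correlations. One conventional choice, sketched here in Python as an assumption rather than as the authors' procedure, converts r to a t statistic with N - 2 degrees of freedom. The function name `t_for_r` is hypothetical; the example values (r = .36, N = 42) are those printed in Table 2 for the peer-superior mean rating and total combat hours.

```python
import math


def t_for_r(r, n):
    """t statistic for testing a product-moment r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Peer-superior mean rating vs. total combat hours, as printed in Table 2.
r, n = 0.36, 42
t = t_for_r(r, n)
print(f"r = {r:.2f}, N = {n}: t = {t:.2f} on {n - 2} df")
```

The value is about 2.44 on 40 degrees of freedom, which exceeds the two-tailed .05 critical value of 2.02, in agreement with the asterisk attached to that coefficient in Table 2.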

Considering first the peer-superior mean 
rating, it can be seen that there is a signifi- 
cant positive relationship with total combat 
hours and an insignificant negative relation- 
ship with total noncombat hours. With the 
exception of the responsibility rating correla- 
tions, this same pattern of positive relation- 
ships with total combat hours and negative 
relationships with total noncombat hours holds 
for all the peer-superior ratings. In general, 
then, it would appear that a pilot’s peers and 
superiors react positively to his combat ef- 
forts and negatively to his noncombat flying. 
This reaction is most apparent in the specific 
ratings of competence, fairness, and courage. 
The significant, positive correlations between
total combat hours and the fairness and cour-
age ratings and the significant negative rela-
tionship between fairness and total noncom-
bat hours can be interpreted as revealing
feelings of resentment and disparagement to-
ward those not carrying their fair share of
combat flying.


Table 1
Description of Variables


COMBAT

Objective Performance Data

Total Combat Flying Time (Hours): Self-explanatory; standard score.

Total Overseas Flying Time minus Total Combat Flying Time: Self-explanatory; standard score.

Peer-Superior Ratings

A booklet containing 50 pairs of bipolar descriptions was given to a subject’s peers and superiors. Responses
were on a seven-point scale, ranging from 1 (agreement with the negative statement) to 7 (agreement with the
positive statement). Six trait scores were derived by matching each of the 50 items with a corresponding item
in a scale factor analyzed by Roff (3). They represent six of the seven factors extracted in his study and are
presented here in a paraphrase of his titles and descriptions. All scores except Trait 6 have been converted to
a nine-point normalized scale; high score = positive end of scale.

Mean Rating: Average of responses to all 50 items by each rater for each subject.

Trait 1 - Competence in Combat Flying: Items refer to various aspects of flying combat performance, judgment,
flying skill, knowledge, planning, responsibility for others in combat, air discipline, and combat briefing.

Trait 2 - Fairness: Items refer to type of behavior ordinarily considered to indicate integrity and moral character.
Officers rated high on this trait would be those whose subordinates could feel assurance of just, honest, and un-
selfish treatment in such matters as assignment to missions, promotions, etc.

Trait 3 - Courage: Items indicate courageous behavior and a sharing of dangers.

Trait 4 - Responsibility (General Duties): Items refer to conscientiousness about preparation for combat, general
enthusiasm regardless of the task, and a willingness to deal with necessary administrative work.

Trait 5 - Likability: Items refer to characteristics relating to getting along well with other persons.

Trait 6 - Discipline: One item referring to strictness of ground discipline. Correlated in raw score form.

Psychologists’ Ratings

Symptom Rating: A statement by the psychologists that a subject exhibited behavior which could be interpreted
as pathological. Subjects were divided into two groups, those without symptoms and those with symptoms.

Final Rating: An evaluation on a four-point scale (Superior, Pass, Marginal, and Fail) of subjects’ over-all
adjustment and performance in combat. Subjects were divided into two groups: superior versus the other three
categories.


TRAINING

Objective Data

Pilot Stanine: A score representing an estimate of a subject’s aptitude for flying derived from a battery of paper-
and-pencil and psychomotor tests. Scores presented on a stanine scale; high score = much aptitude.

Officer Quality Stanine: A score representing an estimate of a subject’s officer-like qualities derived from the same
battery of tests as the pilot stanine. Sometimes interpreted as a measure of intelligence. Scores presented on
a stanine scale; high score = much officer-like quality.

Age: Age upon entrance to pilot training.

Peer-Superior Ratings

The Military Aptitude Ratings (MAR) are evaluations by a man’s peers and superiors of his military aptitude.
Evaluations are made on a 25-item checklist, each item having five points. Ratings are summed over raters for
each subject and converted to a normalized nine-point scale. High score = much military aptitude, as rated.

Weighted Average MAR: An average of each subject’s ratings by all raters in which the ratings by the Tactical
Department officers receive the highest weight.

Classmates MAR: Ratings by a subject’s classmates.

Upperclass MAR: Ratings of a subject by the men in the class entering training just prior to his class.

Tactical Department MAR: Ratings of a subject by his tactical officers (military training instructors).

Flying Department MAR: Ratings of a subject by his flying instructors.

Psychologists’ Ratings

Adjustment Group (1): An over-all rating by a clinician using all information available on a subject, including
original evaluation by a psychologist, military aptitude ratings, number of demerits, flying and academic grades,
etc. Subjects are rated on a four-point scale, ranging from 1 (very well adjusted to training and with a good
prognosis for the future) to 4 (very poorly adjusted to training and a poor prognosis for the future).

Adjustment Rating (4): A rating assigned to a subject by a clinician using individually administered psycho-
logical tests and an interview as the basis for evaluation of adjustment, motivation, and prognosis. Scores are
on a nine-point scale, ranging from 1 (inadequate motivation and personality) to 9 (excellent motivation and per-
sonality).

Since the correlation between the compe- 
tence rating and total combat hours is negli- 
gible, while the correlation between compe- 
tence and total noncombat hours is negative 
and significant beyond the .05 level, it may 
be that combat flying is not perceived as be- 
ing related to flying competence, but that an 
excessive amount of noncombat flying is a 
mark of incompetence. This may be in- 
dicative of a selection process whereby the 
obviously less competent flyer is assigned 
more routine and less hazardous flying tasks. 
Such an assignment will usually result in an 
increased number of flying hours because the 
less hazardous aircraft are those with longer 
ranges and are used more frequently for rou- 
tine administrative flying. 

One possible explanation of these relation- 
ships is in terms of a generalized perception, 
or halo effect. The relatively incompetent 
pilot who accrues many overseas noncombat 
flying hours is seen as lacking in all respects; 
whereas the pilot with much combat flying is 
rated as better in all respects except compe-
tence. The lack of relationship between the
competence rating and combat flying can be
attributed to homogeneity of skill within the
group of combat flyers.

The relationships between the two objective 
performance variables and the two ratings by 
psychologists indicate that combat flying time 
was a correlate of their ratings. This is espe- 
cially true of the final rating. Since the cor- 
relations here are biserial, based on the di- 
chotomy of superior vs. pass, marginal, and 
fail, it is obvious that the men with the most 
combat flying were more often considered su- 
perior. 
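
For readers unfamiliar with the biserial coefficient, the following Python sketch shows the textbook computation for a continuous variable (such as flying hours) against a dichotomized rating (such as superior versus all others). It is offered only as an illustration: the scores are hypothetical, the function name `biserial_r` is an assumption, and the use of the population standard deviation of all scores is one common convention, not necessarily the one the authors followed.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(high_scores, low_scores):
    """Biserial r between a continuous score and an artificial dichotomy.

    r_b = ((M_high - M_low) / s_total) * (p * q / y), where p and q are the
    proportions in the two groups and y is the ordinate of the unit normal
    curve at the point dividing the area into p and q.
    """
    n_high, n_low = len(high_scores), len(low_scores)
    p = n_high / (n_high + n_low)
    q = 1.0 - p
    s_total = pstdev(list(high_scores) + list(low_scores))
    z_cut = NormalDist().inv_cdf(q)  # cut point leaving proportion p above it
    y = NormalDist().pdf(z_cut)
    return (mean(high_scores) - mean(low_scores)) / s_total * (p * q / y)


# Hypothetical standard scores for combat hours (NOT data from the study).
superior = [5, 6, 7, 8]
others = [3, 4, 5, 6]
print(round(biserial_r(superior, others), 3))
```

A positive coefficient means that the group placed in the "high" category has the larger mean on the continuous variable, which is the sense in which the text says that men with the most combat flying were more often considered superior.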

The symptom rating correlations are not as 
pronounced as those with the final rating but 
do suggest that an absence of behavioral 
manifestations indicative of some pathologi- 
cal process is positively related to the total 
amount of combat flying. In other words, in 
this group of pilots there was more of a tend- 
ency for symptomatology to be evident among 
pilots who did little combat flying. It is pos- 
sible that men without overt symptomatology 
are demonstrating better adaptability to the 
Air Force and, in this situation, to the rigors 
of combat than are men with symptoms. If 
this be true, the symptom-free pilot would be 
expected to be a better pilot and more fre- 
quently selected for combat missions than 
pilots with symptoms. In the extreme, those 
men with symptoms would be completely dis- 
abled and would be removed from flying 
status altogether. 

Active combat duty can also be considered 
as an outlet for the tensions created by the
war. A man with much opportunity to “work
out” his tensions in open aggression against
the enemy might reduce them below the
symptom-producing level. Contrasted with
these men would be pilots having less active
combat flying. Without the opportunities af-
forded by combat for releasing tensions, this
latter group might be expected to have a
higher incidence of symptomatic behavior.

A third explanation is also tenable. The
psychologists may have been biased in their
perception of symptoms by the amount of
combat flying time obtained by the pilots.
Symptoms in a man with much combat might
be overlooked, while symptoms in a man with
little time might be perceived because of the
sensitization of the psychologists resulting
from a feeling that little combat must be in-
dicative of some maladjustment.


Table 2

Correlations of Peer-Superior and Psychologists’ Ratings Collected in Combat with Combat
Objective Performance Variables (Hours)

                      Hours: Combat
Ratings               r ²     N
Peer-Superior
  Mean                36*     42
  Competence          04      41
  Fairness            33*     41
  Courage             41*     41
  Responsibility      19      35
  Likability          [?]     41
  Discipline          [?]     38
Psychologists’
  Symptom             21   20   -07
  Final               42*  17   24   00

Hours: Noncombat; Psychologists’ Ratings ¹: Symptom (r, N), Final (r, N)

35° 23   22   23 21   23 21   17   28   28 28   28 23 21 22 17   28
14 25 23 20   15 28   25   17 22   17 22

1 All correlations with the psychologists’ symptom and final ratings are biserial. The upper N for each
biserial correlation is the frequency in the “good” or “high” group.
2 Decimal points have been omitted from all correlations.
# Indicates significance beyond the .10 level.
* Indicates significance beyond the .05 level.
** Indicates significance beyond the .01 level.

To evaluate the second question, the rela- 
tionships presented in Table 2 between the 
peer-superior ratings and the psychologists’ 
evaluations in the combat situation were ex- 
amined. It can be seen that all the correla- 
tions, except one, are statistically significant 
beyond the .10 level, and the majority be- 
yond the .01 level, indicating much agree- 
ment between the two types of evaluations. 
Whether or not this agreement is a spurious 
result of the psychologists’ knowledge of the
opinions of peers and superiors cannot be de- 
termined. The correlations between the symp- 
tom and final rating and the peer-superior 
ratings are far from unitary, indicating that 
the psychologists are contributing some unique
variance in their evaluations of the subjects’ 
overseas and combat performance. Granting 
complete objectivity of the psychologist, the 
relationships between his ratings and the peer- 
superior evaluations would be a result of com- 
mon determinants and not contamination. In 
any event, agreement between the two types 
of ratings does exist. 

The psychologist symptom rating is most 
highly related to the peer-superior likability 
rating. Apparently in individuals considered 
likable, deviant behavior is relatively absent 
or, at least, is not perceived as readily by the 
raters. The constellation of traits creating 
likability seems to be a pervasive one and, as 
will be seen later, an enduring characteristic. 

Among the psychologist final rating corre- 
lations, the largest relationship is with the 
peer-superior rating of courage. Since both 
of these variables are also relatively highly 
correlated with total combat hours, the de- 
pendence of personal evaluations upon the 
amount of combat exposure is again sug- 
gested. 

The preceding results have clearly demon- 
strated a large degree of communality be- 
tween the measurements of overseas and com- 
bat performance by objective variables, peer- 
superior ratings, and psychologist evaluations. 
It can be concluded that ratings by peers, 
superiors, and psychologists are related to the 
objective performance items and that the psy- 
chologist’s evaluations are definitely related to 
peer and superior officer opinions. 

The final question deals with the predicta- 
bility of combat criteria. Attempts to pre- 
dict combat and overseas performance from 
measurements collected prior to combat have 
covered a wide range of predictive devices. 
This discussion will be limited to a number 
of variables utilized in this research program 
for preliminary evaluation of experimental 
adaptability screening tests. The results are 
presented in Tables 4, 5, and 6. 

In the following interpretation and discus-
sion, it should be noted that most of the train-
ing level variables have a restricted range in
the combat sample relative to the range pres- 
ent in the total group of entrants into training 
or the total group of graduates from training. 
The extent of this restriction may be judged 
from Table 3 which contains the variances 
based on all entrants into training, all gradu- 
ates from training, and the smallest and larg- 
est variances obtained in the correlations of 
the training level variables with the combat 
variables. The F tests of the significance of 
the differences between the combat sample 
and training level groups indicate relatively 
little restriction when the largest combat 
variances are used. Also there is less re- 
striction apparent when the variances for to- 
tal graduates from training are compared with 
the combat sample variances. 

Because of this fluctuation in the combat 
sample variances, the small number of cases 
available, and the two different training level 
reference groups, each of which could be used 
as a base for judging range restriction, cor- 
recting the obtained correlations for the re- 
striction did not seem to be justified. Hence 
all of the correlations reported in Tables 4, 
5, and 6 are uncorrected. 
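
For reference, the kind of adjustment the authors decided against can be sketched as follows. The Python function below implements the standard univariate correction for direct restriction of range on the predictor; it is shown only to make clear what "correcting the obtained correlations" would involve. The function name and the example values are hypothetical, since the variances of Table 3 are not reproduced in this text.

```python
import math


def correct_for_range_restriction(r, sd_unrestricted, sd_restricted):
    """Estimate the unrestricted correlation from a restricted-sample r.

    Uses r_c = r*k / sqrt(1 - r^2 + r^2 * k^2), where k is the ratio of the
    unrestricted to the restricted standard deviation of the predictor.
    """
    k = sd_unrestricted / sd_restricted
    return r * k / math.sqrt(1.0 - r * r + r * r * k * k)


# Hypothetical illustration: an observed r of .30 in a sample whose predictor
# standard deviation is half that of the reference group.
print(round(correct_for_range_restriction(0.30, 2.0, 1.0), 3))
```

Because the result depends directly on the ratio of standard deviations, the two reference groups (all entrants and all graduates) and the fluctuating combat-sample variances would each yield a different corrected value, which is the difficulty the authors cite for leaving the coefficients uncorrected.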

The first of the objective variables investi- 
gated was the pilot stanine score. This is a 
measure designed to assess the aptitude for 
flying of applicants for training. A high score 
indicates a high aptitude and good prognosis 
for success in the training program; a low 
score indicates a poor prognosis. It has been 
used extensively in the present research pro- 
gram to equate high and low adaptability 
samples on aptitude for flying. 

Theoretically, a score of this kind would be 
expected to have positive correlations with 
overseas and combat variables which are a 
function of achievement or competence in fly- 
ing. In Table 4 the negligible size of the 
correlations with the two objective perform- 
ance variables indicates little relationship with 
actual flying. The correlations of pilot stanine
with the other overseas variables are all in ac- 
cord with expectation in terms of the direction 
of relationship and generally are larger than 
the correlations with the objective perform- 
ance variables. Although only one reaches an 
acceptable level of statistical significance, this 
general agreement in sign suggests that pilot
stanine may have some slight predictive value
for combat criteria based on ratings.


Table 4

Correlations Between Training Level Objective
Variables and Combat Variables ¹

                        Pilot         Officer Quality
                        Stanine       Stanine           Age
Combat Variables        r     N       r     N           r     N
Hours
  Combat                01    38      12    36          01    49
  Noncombat             09    37      01    35          09    48
Peer-Superior Ratings
  Mean                  23    44      21    43          02    55
  Competence            12    43      15    43         -04    55
  Fairness              15    43      21    43          05    55
  Courage               14    43      03    43         -18    55
  Responsibility        14    37      28#   37          30*   47
  Likability            20    41      20    [?]         [?]   [?]
  Discipline            13    40      27#   40          [?]   [?]
Psychologists’ Ratings ²
                              20            20                25
  Symptom               05    [?]     31#   18          [?]   27
                              17            16                20
  Final                 35#   23      08    [?]         [?]   32

1 Decimal points have been omitted from all correlations.
2 All correlations with the psychologists’ symptom and final
ratings are biserial; all others are product-moment. The upper
N for each biserial is the frequency in the “good” or “high”
group.
# Indicates significance beyond the .10 level.
* Indicates significance beyond the .05 level.

The second objective predictor variable is 
the officer quality stanine. This is a score 
similar to the pilot stanine in its derivation 
and scaling but is a measure of factors such 
as general intelligence which are thought to be 
related to a man’s potential effectiveness as an 
Air Force officer. This variable would be ex- 
pected to have positive relationships with 
overseas and combat measures reflecting com- 
petence as an officer. 

The correlations of the objective perform- 
ance variables with officer quality stanine are 
negligible. On the other hand, the correla- 
tions of officer quality stanine with the peer- 
superior and psychologist ratings are all in 
the anticipated direction with three of these 
significant beyond the .10 level. Of these, 
the responsibility and discipline ratings are
the ones which intuitively seem to be the 
most direct reflection of effectiveness as an 
officer. The fact that the original assessment 
was made at least two years before the col- 
lection of the ratings in the combat theater of 
operations makes the relationships even more 
remarkable. Apparently there is a continuity 
of behavior extending over a long period of 
time which can be measured prior to training 
and subsequently assessed by peer-superior 
evaluations collected in an operational situa- 
tion. 

In any group of young pilots, such as those 
used in this research, there is a very restricted 
range of age. Such homogeneity will reduce 
the magnitude of the relationships with other 
variables. Consequently, finding a correla- 
tion between the peer-superior rating of re- 
sponsibility and age significant beyond the 
.05 level is surprising. It does indicate, how-
ever, that the older the pilot the more likely
he is to be perceived by others as being re- 
sponsible. 

Examination of Table 5 reveals that 7 of 
the 10 correlations of the military ratings with 
the objective performance variables have 
negative signs. Only the correlations with 
the flying department ratings are consistently 
in agreement with expectation.

One possible explanation of this would be 
to attribute factorial complexity to the mili- 
tary aptitude rating form, permitting some- 
what different interpretations or use of the 
items by the different classes of raters. If 
this were so, then the correlations suggest 
that the amount of combat and noncombat 
flying is determined to some extent by fac- 
tors which tend to be negatively associated 
with training level evaluations by classmates, 
upperclassmen, and tactical officers.³

This hypothesis of factorial complexity is
supported in the present data by the correla-
tions of the military aptitude ratings with the
combat peer-superior ratings. Thirty-two of
the 35 correlations in Table 5 are positive
and 3 (weighted average, classmates, and up-
perclass) are significant beyond the .05 level,
all with the peer-superior rating of likabil-
ity. It will be recalled that a number of the
combat peer-superior ratings were highly re-
lated to the objective performance variables,
but that the likability rating was not one of
them. This indicates that ratings by class-
mates and upperclassmen are influenced con-
siderably by whatever attributes a man may
have which make him likable; that these at-
tributes are apparently enduring and are re-
flected in ratings made by peers and superiors
at a later time; and that the characteristics of
likability are marginally, if at all, related to
either training level evaluations by flying
instructors or later combat performance as
measured by flying time. A continuity of
likability is not unexpected, but it is not
often that it can be demonstrated over a long
period of time and in two grossly different
situations.

3 The factorial complexity of the military aptitude
ratings by classmates, upperclass, tactical department
officers, and flying department officers has been dem-
onstrated by Kubala (2). In his factor analysis of
training level data involving these ratings among
other variables, it was found that classmates, upper-
class, and tactical department ratings were loaded on
a factor orthogonal to the factor on which flying de-
partment ratings were most heavily saturated.


Table 5

Correlations Between Training Level Peer-Superior Ratings and Combat Variables ¹

                         Average       Classmates    Upperclass    Tactical      Flying
Combat Variables         r     N       r     N       r     N       r     N       r     N
Hours
  Combat                -21    32     -28    32     -28    32     -14    32      06    32
  Noncombat             [?]    31     -15    31     -30#   30     -04    30      30#   31
Peer-Superior Ratings
  Mean                   21    34      11    34      29    32      08    33      28    34
  Competence             13    33      20    33      34#   31      01    32     [?]    33
  Fairness               12    33     [?]    33      06    31      12    32      16    33
  Courage                02    33     -20    33      21    31     -06    32      07    33
  Responsibility         14    28      19    28      19    26     -07    27      28    28
  Likability            [?]*   33     [?]*   33      47**  31      24    32      14    33
  Discipline             20    31      05    31      13    29      01    30      21    31
Psychologists’ Ratings ²
                               [?]           [?]           14            14            14
  Symptom                38*   [?]    [?]**  [?]     26    16      16    15      37#   16
                               [?]           [?]           13            13           [?]
  Final                  40*   [?]     18    [?]    [?]    [?]    [?]    [?]    [?]    [?]

1 Decimal points have been omitted from all correlations.
2 All correlations with the psychologists’ symptom and final ratings are biserial; all others are product-moment.
The upper N for each biserial is the frequency in the “good” or “high” group.
# Indicates significance beyond the .10 level.
* Indicates significance beyond the .05 level.
** Indicates significance beyond the .01 level.
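
The sign count reported for Table 5 (32 positive among 35 correlations) can be put in perspective with a simple binomial calculation. The Python sketch below is an illustration added here, not an analysis performed by the authors, and the function name `upper_tail_binomial` is an assumption. It treats the 35 signs as independent with positive and negative equally likely, which is only a rough approximation because the correlations are computed on overlapping groups of subjects.

```python
from math import comb


def upper_tail_binomial(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(
        comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1)
    )


# 32 or more positive signs among 35 correlations, under a 50-50 chance model.
print(upper_tail_binomial(32, 35))
```

Under that idealized model the probability of 32 or more positive signs is well below one in a million, so the predominance of positive signs is unlikely to be accidental even though few individual coefficients reach significance.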

In addition, the correlations of the training 
level ratings by the flying department with 
the various combat variables and the small 
relationship between combat likability ratings 
and combat or noncombat flying support the 
hypothesis that opportunity to accrue com- 
bat experience is relatively independent of a 
man’s likability, and that evaluations of a 
man by his associates in combat are not en- 
tirely determined by his likability. 

Out of all the overseas and combat vari- 
ables presented, the psychologist symptom 
rating and final rating have relatively the 
largest number of significant correlations with 
the military aptitude ratings. The fact that
the symptom rating is associated with the
classmates rating beyond the .01 level of sig-
nificance, with the weighted average rating
beyond the .05 level of significance, and with
the flying department rating beyond the .10
level, suggests another type of personality
continuity. It is possible that those person-
ality characteristics which the psychologists
observed overseas and labeled as symptoms
are not unique to the overseas situation but
are behavioral deviations of long standing
which had an adverse effect on ratings of
military aptitude made by a man’s associates
and instructors during training.


Table 6

Correlations Between Training Level Psychologists’
Ratings and Combat Variables ²

                          Adjustment      Adjustment
                          Group ¹         Rating
Combat Variables          r      N        r      N
Hours
  Combat                  [?]   [?]      [?]    [?]
  Noncombat               [?]   [?]      [?]    [?]
Peer-Superior Ratings
  Mean                   -32*   [?]      [?]    [?]
  Competence             -26    [?]      [?]    [?]
  Fairness               -38*   [?]      [?]    [?]
  Courage                -13    [?]      [?]    [?]
  Responsibility         -20    [?]      [?]    [?]
  Likability             -29#   [?]      [?]    [?]
  Discipline             -24    [?]      [?]    [?]
Psychologists’ Ratings ³
  Symptom                 [?]   [?]      [?]    [?]
  Final                  -43*   [?]      [?]    [?]

1 Since good adjustment is associated with a numerically
small adjustment group (AG) score and poor adjustment with
a large AG score, negative correlations indicate positive asso-
ciation between AG and the other variables.
2 Decimal points have been omitted from all correlations.
3 All correlations with the psychologists’ symptom and final
ratings are biserial; all others are product-moment. The upper
N for each biserial is the frequency in the “good” or “high”
group.
# Indicates significance beyond the .10 level.
* Indicates significance beyond the .05 level.

Of the two training level psychologist
evaluations presented in Table 6, only one,
the adjustment rating, can be considered a
pure measure of the clinical judgment of psy-
chologists made during training. The second
measure, the adjustment group, contains the
psychologists’ judgment as a major com-
ponent, but represents the synthesis of an
adaptability criterion by a second psycholo-
gist from the ratings made by the first to-
gether with other information, such as mili-
tary aptitude ratings, number of demerits,
flying grades, and so on, available at the time
each subject completed the training program.

None of the correlations of the adjustment 
rating with the combat criteria is significant 
beyond the .10 level, nor do the signs of the 
coefficients permit any interpretation. The 
only conclusion possible from these data is 
that purely clinical evaluations of men made 
at the end of flight training by psychologists 
and summarized in a rating of this type are 
not related to subsequent combat estimates 
of adjustment and effectiveness. 

On the other hand, when the psychologist’s 
clinical evaluations are utilized in construct- 
ing an adaptability criterion which also takes 
into account other training information avail- 
able on each subject, statistically significant 
relationships in the expected positive direc- 
tion * are obtained with the combat criteria. 
Of the 11 adjustment group correlations, 3 
are significant beyond the .05 level and 1 be- 
yond the .10 level. All of these relationships 
are with variables derived from peer-superior 
ratings or evaluations by the overseas psy- 
chologists. 

It may be tentatively concluded from this 
analysis that clinical psychological evaluations 
using only clinical tests and interviews do not 
predict later criteria of adjustment and
performance similar to the ones collected in this
research. However, when the clinical psy- 
chological evaluation is considered together 
with other information, the composite
numerical index does predict later adjustment
and performance estimates which are also 
based on personal evaluations. 


Summary 


This investigation has shown, first, that 
ratings of combat performance and adjust- 
ment made in the field by peers, superiors, 
and psychologists are related to each other 
and to objective data collected at the same
time; second, that combat criterion measures
are predictable by precombat criteria of
adjustment and performance; and finally,
that a complex personality dimension called
likability seems to be one of the most
enduring characteristics of individuals.

*Since Adjustment Group ratings range from 1
(high) to 4 (low), positive relationships would be
represented by correlation coefficients with minus
signs.


Received July 24, 1956. 

This paper is a report on certain personality
characteristics found in a group of union
business agents,¹ and a discussion of their
possible implications, in terms of both the
nature of the business agent’s role and the
possibility of predicting success of individuals in
the business agent status. More specifically,
some characteristic demands of the business
agent role will be described first, and an
attempt then will be made to relate these to:
(a) personality variables representative of the
group and (b) personality differences that
discriminate between those business agents
who were rated highest and those who were
rated lowest by their peers.


Procedures 


This study was conducted on a group of 21 union
business agents operating within a joint board
organization.² Personality characteristics were
measured by means of the Minnesota Multiphasic
Personality Inventory (4). A measure of intelligence
was derived from the Wesman PTI (9). Data on
age and formal educational level were obtained from
union records.

Information pertaining to role, here defined in its
simplest sense as the generally accepted pattern of
rights and duties attached to a status, was obtained
through open-ended interviews with each agent,
logs of time allotment³ kept by each agent, and
observation of the business agents’ activities. The
two-hour interviews, conducted by the authors, probed
the following areas: over-all job, relations with
management, relations with shop stewards and union
officers, relations with rank and file union members,
relations with other business agents on the staff,
contract negotiations, grievance handling, union
political action, shop and local meetings, and organizing.
Shorthand notes of the interviews were taken while
they were in progress.

¹ A union business agent is a full-time, paid union
official, working for a local or regional union, who
is engaged in representing the union membership to
management.

² A joint board organization consists of a number
of local unions of a given international union joined
together into a single administrative unit. This
organization is formally governed by delegates elected
on a proportionate basis from each local union.

³ Business agents were requested to record the
amount of time they spent each day on each of the
various phases of their work. This work log was
kept over a two-week period.

Role success was defined in terms of peer evaluations.⁴
Each agent listed the two peers whom he
considered to be doing the best job and the two 
whom he considered to be doing the poorest job in 
each of the 11 areas covered in the interviews. A 
total score (covering all areas) was derived for each 
agent rated by counting the positive evaluations as 
plus and the negative evaluations as minus and add- 
ing them. Equal weightings for each area were as- 
sumed in lieu of any contradictory evidence. The 
total scores were then ranked for the group. 
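
The scoring rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the agent labels, the area names, and the function names `peer_scores` and `rank_agents` are hypothetical, and breaking ties alphabetically is an arbitrary choice that the study does not specify.

```python
from collections import defaultdict


def peer_scores(nominations):
    """Sum peer evaluations into one total score per agent.

    `nominations` maps each rater to a dict of areas; each area holds a
    pair (best, poorest) of lists of agent labels. Every "best" mention
    counts +1 and every "poorest" mention counts -1, with all areas
    weighted equally, as in the procedure described in the text.
    """
    totals = defaultdict(int)
    for rater, areas in nominations.items():
        totals[rater] += 0  # make sure every rater appears in the result
        for best, poorest in areas.values():
            for agent in best:
                totals[agent] += 1
            for agent in poorest:
                totals[agent] -= 1
    return dict(totals)


def rank_agents(scores):
    """Order agents from highest to lowest total score.

    Ties are broken alphabetically (an assumption made for this sketch).
    """
    return sorted(scores, key=lambda agent: (-scores[agent], agent))


# Hypothetical data: two raters, two of the interview areas.
example = {
    "A": {
        "grievance handling": (["B", "C"], ["D"]),
        "contract negotiations": (["B"], ["D"]),
    },
    "B": {
        "grievance handling": (["A"], ["D"]),
        "contract negotiations": (["C"], ["A"]),
    },
}
totals = peer_scores(example)
ranking = rank_agents(totals)
```

With real data each of the 21 agents would supply two "best" and two "poorest" names in every area; the sketch accepts shorter lists only to keep the example small.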

To relate personality to role success, averages in 
terms of the MMPI scales were derived for the 
four most highly rated agents and for the four who 
received the lowest ratings. Because the sample was 
too small for meaningful statistical manipulation, 
differences between groups were inspectionally deter- 
mined. 
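
The comparison of extreme groups can be illustrated as follows. The sketch assumes that each agent has a total peer score and a dictionary of MMPI scale scores; the function name `extreme_group_means` and all numbers are hypothetical. It only tabulates the two group averages for inspection and performs no significance test, in keeping with the procedure described.

```python
from statistics import mean


def extreme_group_means(total_scores, profiles, k=4):
    """Average each scale for the k highest- and k lowest-rated agents.

    `total_scores` maps agent -> total peer score.
    `profiles` maps agent -> {scale name: score}.
    Returns {scale: (high-group mean, low-group mean)}.
    """
    ordered = sorted(total_scores, key=total_scores.get, reverse=True)
    high, low = ordered[:k], ordered[-k:]
    scales = next(iter(profiles.values())).keys()
    return {
        scale: (
            mean(profiles[agent][scale] for agent in high),
            mean(profiles[agent][scale] for agent in low),
        )
        for scale in scales
    }


# Hypothetical data for four agents, compared two against two.
total_scores = {"A": 5, "B": 3, "C": -2, "D": -6}
profiles = {
    "A": {"D": 50, "Pa": 58},
    "B": {"D": 54, "Pa": 56},
    "C": {"D": 60, "Pa": 50},
    "D": {"D": 66, "Pa": 48},
}
group_means = extreme_group_means(total_scores, profiles, k=2)
```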


Results and Discussion: Some Significant 
Role Demands 


One of the most striking characteristics of
the business agent role was the heavy demand
made upon time and energy. To fulfill this
demand, the business agents could not be
confined to an eight-hour day and a five-day
week. They were expected to spend daylight
hours primarily in dealing with management
in behalf of the membership, i.e., collective
bargaining and contract negotiations,
grievance handling, and arbitration. The working
“day” also was expected to include the
preparation of union reports and contracts,
inter-union meetings, and staff meetings. At the
end of the work “day,” the second phase of
the business agents’ expected activities began:
formal meetings with the membership in shop
and local, joint board meetings, organizational
drive meetings, and informal, but
highly necessary, social interaction with the
membership. Duties relative to civic affairs,
and legislative and political duties were also
expected to be fitted into the already tight
schedule.

⁴ It was assumed that success in a role
meaningfully must be considered in terms of the evaluation
of those who control entrance into and departure
from the status, i.e., those upon whom maintenance
of the status ultimately depends. In the union
studied, entrance into and departure from the
business agent status were controlled in fact, although
not in formal procedure, by the business agent group
(5).

As a result of these time demands, oppor- 
tunities for activities beyond the union sphere 
(home, family, friends, etc.) were severely 
limited during the work week and sometimes 
during the weekends as well. The weekends 
provided the business agents with a chance to 
“catch up” or sometimes were filled with 
meetings that either could not be fitted into 
the five-day week or were of an emergency 
nature. The job demanded all, and any re- 
luctance to give unsparingly was viewed criti- 
cally by others in the organization. Conse- 
quently, considerable psychological, as well as 
physical, isolation from family and friends 
occurred. The lack of interest in, or under- 
standing of, the business agents’ problems 
shown by these groups also enhanced the 
agents’ isolation. The psychological and 
physical anchoring points for the union busi- 
ness agents had to be, and literally were, in 
the job. 

Another pertinent aspect of the role was
its problem orientation. The business agents
were constantly faced with problems of others.
In so being, they were faced with problems of
their own. Maintenance of their own status
was, in large part, dependent upon successfully
coping with and resolving others’ problems.
They could not afford to put problems
out of their minds. They were destined to be
problem solvers, who, obviously, often could
not please everyone involved, and traditional
union structure and ideology also required
that the business agents must be accessible
to the disgruntled and discontent (2).
Although they were operating under con- 
stant tension, they could not afford to show 
it. In all but exceptional situations, they 
were expected to keep their tempers and be 
diplomatic with management, to keep a re- 
spectfully factual attitude with arbitrators, 
to maintain a sympathetic attitude toward 
the rank and file, to support and teach their 
subordinate union officers, and to be help- 
fully constructive with their peers. They 
were expected to approach problems with en- 
thusiasm, and to devote untold energy to each 
phase of their work. They had to be tireless,
ever-enthusiastic, emotionally involved,
yet intellectually calm.

Still another aspect of the role of the busi- 
ness agent was the need for both caution and 
skepticism. In dealings with management, 
the business agents had their best arguments 
in terms of intangibles, i.e., membership soli- 
darity and the potential aggressiveness of the 
rank and file toward the employers. Only 
rarely could the business agents “lay their 
cards on the table” and share reactions and 
facts with their antagonists. They had to 
sell their power potential, which was based 
on their control of the membership. Man- 
agement’s perceptions of the union’s bond 
with the employee-member and the business 
agents’ expressed appraisal of membership 
attitudes and motivations were often the de- 
termining factors in the business agents’ suc- 
cesses in bargaining. In the verbal exchange 
that characterizes nonviolent relationships 
with management, there had to be a con- 
stant and critical evaluation of word against 
fact—an awareness that logic and talk can 
be, and often are, misleading. “Getting the 
bottom apple out of the barrel” could not be 
accomplished through a naive and trusting 
attitude. 

With the membership, too, caution and 
skepticism were vital. The business agents 
knew that some members were potential com- 
munication channels to the employer. Conse- 
quently, restriction of information became a 
tactical necessity—a necessity that even the 
loyal unionist sometimes did not quite ap- 
prehend and often criticized. The business 
agents had to be cautious in other relations 
with their membership, as well. They had 
learned that they could not afford to ascribe 
to a “member right or wrong” philosophy. 
They were aware that members, ego involved 
and threatened by management censure, often 
reported grievances as they wished they had 
happened rather than as they occurred. A 
misstated grievance carried to arbitration 
might act as a precedent, which would be
disadvantageous to the union or might cause
the loss of personal and union stature with 
labor board, arbitrator, and management. 
The business agents had to learn to doubt. 

The business agents, then, were operating
in an environment that demanded their time
and energy to the exclusion of normal social
intercourse with other groups. They were
constantly faced with problems of others and
were responsible for taking over such burdens.
They were confronted with situations that
negated the advisability of being open, of
sharing, or of taking the word of either
antagonist or protagonist. They were always
expected to be ready to take on new
responsibilities with enthusiasm and grace, and to
counter aggression and hostility in others
with tact and diplomacy. They lived and
worked for and with others, but could not
share with them.

Personality and Role in a Union Business Agent Group

Table 1
Breakdown of the Business Agent Staff
in Terms of Age
[Column heading: Number of Agents. Table entries not recoverable.]

Personal characteristics. In presenting per- 
sonal data, only material relevant to the 
group will be discussed. Although there 
might be some merit in exploring and dis- 
cussing the characteristics of each member of 
the group, the obvious requirement of ano- 
nymity negates this possibility. 

As Table 1 indicates, the majority of the
agents were beyond their physical prime.
Their average age was 50.3 years. The
youngest was 35 and the oldest 63. One
half of the business agents had been on their
jobs over 10 years (see Table 2) and their
average length of service was 10.7 years.
The newest members on the staff had worked
three years for the joint board and the oldest,
20 years. A rank difference correlation of
+.305 indicates little relationship between
age and length of service for the group.
Although these men carried out their duties
with apparent vigor, the vast majority of
them carried physical scars in terms of
hypertension, heart trouble, ulcers, and colitis.
The greatest number consumed a considerable
amount of alcohol, and a few drank enough
at times so that it interfered with their work.
A number professed concern over their
mental well-being, although this was not a trend
for the group.

Table 2
Breakdown of the Business Agent Staff
in Terms of Length of Service
[Column headings: Length of Service; Number of Agents.
Surviving service categories: 1-4, 5-9, 10-14, 15-19.
Agent counts not recoverable.]
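
For readers unfamiliar with the rank difference correlation cited in the preceding paragraph, the usual computation can be sketched as below. The sketch assumes the standard formula, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), with tied values given the average of the ranks they span; the formula is exact only when there are no ties. The function names and the sample figures are hypothetical and do not reproduce the study data.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_difference_correlation(x, y):
    """Rank difference coefficient between two paired lists."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical ages and years of service for five agents.
ages = [35, 44, 50, 57, 63]
service = [5, 3, 12, 10, 20]
rho = rank_difference_correlation(ages, service)
```

A coefficient near zero, such as the +.305 reported, means that knowing an agent's age rank says little about his rank in length of service.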

Their formal educational level was low
relative to that of their management
counterparts (8), and approximated that of their
rank and file membership (6). Five business
agents had a high school education; a few
had little education beyond the “school of
hard knocks.” Their formal education was,
however, in no way representative of their
intellectual capacities. On factory-worker
norms (9), the group averaged one sigma
above the mean (84th percentile). Only two
agents scored at the 50th percentile or below,
whereas one-quarter of the agents were not
provided with an adequate ceiling in the
testing instrument used.
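
The correspondence between one sigma above the mean and the 84th percentile follows if the norm group's scores are taken to be normally distributed, which is an assumption of this sketch rather than something stated in the text. The helper name `percentile_from_sigma` is hypothetical.

```python
from statistics import NormalDist


def percentile_from_sigma(z):
    """Percentile rank of a score z standard deviations from the mean.

    Assumes a normal distribution of scores in the norm group.
    """
    return 100 * NormalDist().cdf(z)


one_sigma = percentile_from_sigma(1.0)   # roughly 84
at_the_mean = percentile_from_sigma(0.0)  # 50
```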


The Psychological Profile Part I 


The MMPI results, as indicated in Fig. 1,
bear an interesting relationship to the
demands of the business agent role. Although
it is difficult to determine cause and effect
between role and personality, it is quite
evident, even from a cursory analysis, that a
relationship existed.

Fig. 1. Minnesota Multiphasic Personality
Inventory Profile, K-corrected, designating averages for 21
union business agents.

As was previously discussed, the constant
tensions involved in performing the role and
the lack of opportunity to share problems
with family and friends, or for that matter
with the people by whom the business agents
were constantly surrounded, seemed to
result in many psychosomatic complaints. The
tendencies toward such disorders were
indicated by the elevation in the MMPI
categories of hypochondriasis (Hs) and
conversion hysteria (Hy), which indicated a
propensity for “over”-concern with physical
well-being and for reaction to emotional
stresses and strains in terms of physical
symptomatology. Whether it was the role
demands that created such personality trends
or whether it was a function of the basic
personality characteristics being “suited” to the
job is a moot question.

The elevation of the group on the
psychopathic deviate (Pd) scale also bears note.
Such factors as have just been mentioned,
isolation from family and friends and
working under considerable tension, seem to be
relevant here, as well as with respect to the
Hy and Hs scales. Under such circumstances,
release for the business agents could not be
normally channeled, but took somewhat
deviant courses. Moreover, particularly among
the longer tenure agents, early union
leadership had demanded deviancy. In the early
days of unionism, most leaders would have
had little compunction about going to jail or
using socially unacceptable means to further
the organization. The Pd scale indicated
that the business agents, as a group, tended
to be personable but to have little emotional
depth, and to have not internalized the
societal norms.

The elevation on the hypomania (Ma) 
scale, indicative of a tendency toward over- 
activity and enthusiasm, also seemed to be 
related to the role demands. These men were 
driven, not toward self-advancement, but 
rather toward organizational advancement, 
which because of their firm ego involvement 
in their jobs provided a powerful motivating 
force. They were believers in hard work and
accepted and emphasized this as part of their 
role demands. They had to be psychologi- 
cally prepared to cope with the “impossible” 
and to come through with “success.” It 
would seem that only persons with a hypo- 
manic tendency could cope with such role 
demands successfully. 

Although none of the group tendencies 
mentioned above was of clear clinical import, 
they were more obvious than the remaining 
elements of the MMPI. The latter scales 
showed only moderate deviancy for the group 
as a whole. 

Any interpretation of the group elevation 
on the ego-strength (Es) scale must be ex- 
ceedingly tentative, due to the comparative 
newness of the scale and the consequent lack 
of adequate norms (3). It would seem, how- 
ever, that, taking the scale name literally, the 
tendency to bounce back from problems and 
the tendency to adjust to trauma easily were 
also compatible with the role demands. “You 
may lose the first time, and the second, but 
you can’t let it get you down,” and “You 
have to be self-confident, able to take things 
in your stride,” were role expectations that 
seemed related to this personality charac- 
teristic. 

A slight elevation on the depression (D) 
scale could, for these job-oriented business 
agents, be a reflection of fear of failure in an 
environment constantly characterized by pres- 
sure of decision making. 

The slight elevation on the paranoid (Pa) 
scale and the psychasthenia (Pt) scale also 
seemed to be related to the role demands. 
The role, as has been discussed, called for a 
good deal of caution and systematic working 
out of details, as well as a covert mistrust of 
others’ motivations, particularly in bargain- 
ing and grievance handling situations where 
this mistrust had been pragmatically justified. 

The slight rise on the schizophrenia (Sc) 
scale was harder to explain. Bizarre behav- 
ior of any kind and, certainly, asocial or 
withdrawal behavior, would not be in accord 
with the responsible, public nature of the role. 
It might be hypothesized, however, that a 
certain amount of inner detachment was nec- 
essary for individuals who were as constantly 
problem-oriented as the business agents. 


The Psychological Profile Part II


Turning to the relationship between per- 
sonality and role success, Fig. 2 provides a 
basis for further hypotheses concerning the 
role. 

Depression (D) seemed to distinguish be- 
tween the high- and low-rated agents, the 
lows showing considerably more depression. 
These data suggest a number of hypotheses. 
First, depression might have been a reflec- 
tion of failure to live up to the role demands; 
i.e., a “successful” business agent, in the 
course of his job, lived up to the role expec- 
tations and was aware that he had, whereas 
for the “unsuccessful” agent the reverse 
would have been true. The validity of such 
a hypothesis, of course, would depend on the 
degree to which the business agent’s peer 
group was also his reference group. An ear- 
lier analysis (5) provides some evidence that 
this was so. Second, anxiety about the job 
and/or the effect of the job pressures upon 
the individual might have lessened job effec- 
tiveness. As was noted earlier, taking things 
in one’s stride seemed to be an important ex- 
pectation related to role success. 

The paranoid (Pa) scale also seemed to
differentiate between high- and low-rated
business agents. In terms of the scale, the
high-rated agents showed a greater tendency
to doubt and question the motivations
behind the ideas and proposals of others. As
previously discussed, this personality
characteristic seemed related to the role. Two
proscriptions applying to the role were
continually getting “out on a limb” in a
precedent-setting grievance and repeatedly
accepting management’s arguments to the degree
that the membership suffered. In dealing
with both the rank and file and management,
the error of too easily accepting arguments
as valid was fraught with danger. Some
aggressive suspicion as to the motivation of
others appeared to be mandatory for role
success.

Fig. 2. A comparison of high-rated and low-rated
business agents in terms of the Minnesota
Multiphasic Personality Inventory (K-corrected).

One implication of Pa scale elevation is 
that suspicion of others’ behavior is based 
upon an assumption that such behavior may 
be a threat to self. When one considers, 
however, the constant, conscious or uncon- 
scious, manipulation of the business agent by 
both management and members, it seems 
plausible that this suspicion might have had 
a genesis within the role. On the other hand, 
the man who was not “taken in” was prob- 
ably favored in the selection of leaders. 

The last of the apparently discriminating 
scales was ego strength (Es). If it is valid 
to construe elevation on this scale as indica- 
tive of inner resources, this characteristic 
would seem to have a direct bearing on role 
success as perceived by the business agent 
group. As has been noted, the business 
agents’ role was a complex one, filled with 
frustrations, a role with many overlapping 
duties. Ability to carry on in the face of 
failure seemed to be of vital importance. As 
was mentioned, self-confidence and ability to 
weather the storm were expectations of the 
group toward the role.

Implications. The exploratory nature of 
this study and the small number of indi- 
viduals examined make it impossible to ar- 
rive at firm generalizations from the data pre- 
sented. Some hypotheses are suggested, how- 
ever, that could provide a starting point for 
further research. 

To the degree that the role demands dis- 
cussed here are widespread concomitants of 
the business agent status, first, it can be 
hypothesized that personality characteristics 
comparable to those found in the group 
studied may be found for individuals hold- 
ing like statuses throughout the labor move- 
ment. Second, it can be hypothesized that 
throughout the labor movement business 
agents judged to be successful by their peers 
may be distinguished from their lower rated 
co-workers by personality factors comparable
to those that distinguished between high- and 
low-rated agents in this study. 

Further, it can be hypothesized that per- 
sonality characteristics similar to those typi- 
cal of the group studied may be found in any 
role involving demands comparable to those 
discussed. There is some indication in the 
literature that a case in point may be the 
business executive. 

From the studies reported concerning the 
role of the business executive, it seems clear 
that he, like the business agent, is in a time- 
demanding occupation, is job-oriented to the 
exclusion of family (10), is constantly sur- 
rounded by stressful problems, and is sub- 
ject to a high incidence of psychosomatic dis- 
orders (7). The tremendous need for energy 
expenditure and enthusiasm also is noted 
(10). Argyris’ study of successful execu- 
tives indicates further expectations directed 
toward the status that seem comparable: 
ability to work under frustrating situations 
without “blowing up”; accepting loss or de- 
feat without shattering of the personality; 
accepting hostility from others without any 
overt indication of hurt; expressing hostility 
tactfully, rather than letting it run wild (1). 

The apparent ability of the MMPI to
distinguish a business agent population from
the general population, as well as its ability
to distinguish between high- and low-rated
agents, and the possibility that personality
variables characterizing the business agent
group may have parallels in other
administrative statuses, such as the business
executive, may provide a basis for insight and
prediction into the occupational classification of
administrator. For many years, the
determination of factors entering into the role
success of leaders,⁵ including administrators, has
puzzled the social scientist. There has been
a growing contention that the trait approach
is nongeneralizable, due to the heterogeneity
of group and institutional demands re the
administrator-leader. It appears possible,
however, that this conclusion was premature on
two counts: First, that the measurement
techniques utilized tended to emphasize the
“simple” trait measurement which had proved
to be so successful with the manual worker;
and second, that there was little attempt to
find comparability among administrator-leaders
whose role demands appeared to have
some overlap.

⁵ The term “leadership” is purposely not used in
this discussion because of the difficulty of finding
agreement among scholars either as to role
responsibilities or as to necessary requirements to fill
adequately the role demands. “Administrators” is used
in a very broad sense to designate those individuals
who manage or direct the execution, application, or
conduct of institutional policy via the subordinates
in the organizational structure.

It seems reasonable that, if and when role 
demands attached to a status are similar, in- 
dividuals sharing characteristics in common 
are apt to excel regardless of the organiza- 
tional framework. This study provides some 
indications that personality data, such as 
those made available by using the MMPI, 
may be useful in giving definition to some of 
the qualities necessary to fulfill role demands 
satisfactorily. If further research bears this 
out, it may be possible to develop an “upper 
echelon” job-family and thereby gain greater 
comprehension of this important aspect of 
our society, as well as to provide better selec- 
tion techniques for such crucial positions. 


Received July 26, 1956. 